Feature/upstream update #16

Merged
merged 770 commits into from
May 31, 2021

Conversation

langecode
Member

No description provided.

dependabot bot and others added 30 commits April 13, 2021 09:41
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.6.5 to 0.6.6.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](tokio-rs/tokio@tokio-util-0.6.5...tokio-util-0.6.6)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/cache](https://github.com/actions/cache) from v2.1.4 to v2.1.5.
- [Release notes](https://github.com/actions/cache/releases)
- [Commits](actions/cache@v2.1.4...1a9e213)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix(datadog service): Document region parameter

Closes vectordotdev#7079

This was added to the shared Datadog documentation in
vectordotdev#4174 but never added
specifically to the sinks.

Signed-off-by: Jesse Szwedko <[email protected]>

* Add missing cue config for region

Signed-off-by: Jesse Szwedko <[email protected]>
…ev#7047)

* Delete socket

Signed-off-by: ktf <[email protected]>

* Add tests

Signed-off-by: ktf <[email protected]>

* Remove spaces

Signed-off-by: ktf <[email protected]>
Bumps [reqwest](https://github.com/seanmonstar/reqwest) from 0.11.2 to 0.11.3.
- [Release notes](https://github.com/seanmonstar/reqwest/releases)
- [Changelog](https://github.com/seanmonstar/reqwest/blob/master/CHANGELOG.md)
- [Commits](seanmonstar/reqwest@v0.11.2...v0.11.3)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ectordotdev#7081)

* fix bug:clickhouse sink incorrectly encodes arrays

Signed-off-by: shumin <[email protected]>

* Add test for clickhouse arrays

Signed-off-by: Jesse Szwedko <[email protected]>

Co-authored-by: shumin <[email protected]>
Co-authored-by: Jesse Szwedko <[email protected]>
…vectordotdev#7111)

* fix(docker_logs source): Docker logs missing when container has a tty

The `bollard::container::LogOutput` enum is `Console` when the container
is started with a tty. `LogOutput::Console` was not handled in `new_event`
when a stream value was received.

https://docs.rs/bollard/0.10.1/bollard/container/enum.LogOutput.html

Fixes vectordotdev#5903

Signed-off-by: Jean Prat <[email protected]>
This advisory seems to no longer be valid.

Closes: vectordotdev#6223

Signed-off-by: Jesse Szwedko <[email protected]>
…#7127)

Ensure `except_fields`, `only_fields`, and `timestamp_format` appear on
all sinks supporting `encoding`. Previously they only appeared on sinks
that had healthchecks.

Fixes: vectordotdev#6949

Signed-off-by: Jesse Szwedko <[email protected]>
…tdev#7133)

* chore(ci): Add advisory RUSTSEC-2021-0013 back to deny.toml

I neglected to enable all features. The `wasm` dependencies still depend
on an affected version of rust-cpuid.

This reverts commit f30d4c6.

Signed-off-by: Jesse Szwedko <[email protected]>

* Run cargo deny in CI if deny.toml changes

Signed-off-by: Jesse Szwedko <[email protected]>
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.48 to 0.1.49.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](dtolnay/async-trait@0.1.48...0.1.49)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix(ci): Ensure all benchmark artifacts are included

The quotation marks in the `CRITERION_HOME` value were causing Criterion
to put its output in the wrong place (locally it created a directory
called `"`).

Also just upload `./target/criterion`, since GitHub does the zipping for
us. We did need to zip previously, when downloading the artifact in the
same workflow: it was not zipped yet, so we were hitting limits trying to
individually pull a very large number of files.

Signed-off-by: Jesse Szwedko <[email protected]>
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.4.0 to 1.5.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](tokio-rs/tokio@tokio-1.4.0...tokio-1.5.0)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [notify](https://github.com/notify-rs/notify) from 4.0.15 to 4.0.16.
- [Release notes](https://github.com/notify-rs/notify/releases)
- [Changelog](https://github.com/notify-rs/notify/blob/v4.0.16/CHANGELOG.md)
- [Commits](notify-rs/notify@v4.0.15...v4.0.16)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix(remap): Preserve type defs when assigning fields

Previously, remap would overwrite the type def of `.` whenever a new
field was assigned. That is:

```
.foo = 5
.bar = 6
```

Would result in the compiler only having type info for `.bar`.

This change causes it to merge the typedefs whenever a list of path
segments appears.
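
A minimal sketch of the difference between overwriting and merging, assuming a hypothetical `Kind` type that records the fields known for `.`. The names `Kind`, `assign_overwrite`, `assign_merge`, and `known_fields` are illustrative only and are not the VRL compiler's actual types; the sketch also handles a single field name rather than a full list of path segments.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the type information the compiler tracks.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Integer,
    Object(BTreeMap<String, Kind>),
}

/// Illustrative "before" behaviour: assigning a field replaces the
/// whole type def of `.`, so earlier fields are forgotten.
fn assign_overwrite(_root: Kind, field: &str, kind: Kind) -> Kind {
    let mut fields = BTreeMap::new();
    fields.insert(field.to_string(), kind);
    Kind::Object(fields)
}

/// Illustrative "after" behaviour: assigning a field merges it into
/// the fields already known for `.`.
fn assign_merge(root: Kind, field: &str, kind: Kind) -> Kind {
    let mut fields = match root {
        Kind::Object(fields) => fields,
        _ => BTreeMap::new(),
    };
    fields.insert(field.to_string(), kind);
    Kind::Object(fields)
}

/// Names of the fields the type def knows about, in sorted order.
fn known_fields(kind: &Kind) -> Vec<String> {
    match kind {
        Kind::Object(fields) => fields.keys().cloned().collect(),
        _ => Vec::new(),
    }
}

fn main() {
    let empty = Kind::Object(BTreeMap::new());

    // .foo = 5 followed by .bar = 6, under each strategy.
    let overwritten = assign_overwrite(
        assign_overwrite(empty.clone(), "foo", Kind::Integer),
        "bar",
        Kind::Integer,
    );
    let merged = assign_merge(
        assign_merge(empty, "foo", Kind::Integer),
        "bar",
        Kind::Integer,
    );

    println!("overwrite: {:?}", known_fields(&overwritten));
    println!("merge:     {:?}", known_fields(&merged));
}
```

With the overwrite strategy only `bar` survives the second assignment; with the merge strategy both `foo` and `bar` remain known.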

Signed-off-by: Jesse Szwedko <[email protected]>
vectordotdev#7140)

* fix(kubernetes_logs source): Always use file checkpoints if they exist

The `kubernetes_logs` source exposes a `PathProvider` that breaks one of
the `FileServer`'s assumptions, that all available files will be listed at
Vector startup time. Instead, the files are only returned once the k8s
metadata is available to the `kubernetes_logs` source. This caused the
`FileServer` to ignore any checkpoints that existed for these files.

As a short-term fix, we just always use the checkpoint, if available,
for any new files that are seen. This fixes the case for the
`kubernetes_logs` source where they are seen as "new" after start-up.
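
A rough sketch of the decision being changed, under stated assumptions: the `Checkpoints` map, the fingerprint strings, and the fallback offset of `0` for a file without a usable checkpoint are all hypothetical and do not reflect the real `FileServer` types or its configured read-from behaviour.

```rust
use std::collections::HashMap;

/// Hypothetical store mapping a file fingerprint to a byte offset.
type Checkpoints = HashMap<String, u64>;

/// Illustrative "before" behaviour: checkpoints are consulted only for
/// files that were listed at startup. Files discovered later start at
/// the fallback offset (assumed to be 0 here).
fn start_offset_before(
    checkpoints: &Checkpoints,
    fingerprint: &str,
    listed_at_startup: bool,
) -> u64 {
    if listed_at_startup {
        checkpoints.get(fingerprint).copied().unwrap_or(0)
    } else {
        0
    }
}

/// Illustrative "after" behaviour: always use the checkpoint if one
/// exists, regardless of when the file was first seen.
fn start_offset_after(checkpoints: &Checkpoints, fingerprint: &str) -> u64 {
    checkpoints.get(fingerprint).copied().unwrap_or(0)
}

fn main() {
    let mut checkpoints = Checkpoints::new();
    checkpoints.insert("pod-a.log".to_string(), 4096);

    // A file that only becomes visible once k8s metadata is available.
    println!("before: {}", start_offset_before(&checkpoints, "pod-a.log", false));
    println!("after:  {}", start_offset_after(&checkpoints, "pod-a.log"));
}
```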

vectordotdev#6564 exists to test this
behavior, but it seems to pass even without this change, so that test
will need to be updated.

Signed-off-by: Jesse Szwedko <[email protected]>
…vectordotdev#7091)

Fixes vectordotdev#7044

The docs and examples indicated that it should default to case
sensitive; however, `contains`, `starts_with`, and `ends_with`,
defaulted to case insensitive matching.
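
To illustrate the documented default, here is a small sketch of a substring match with an explicit case-sensitivity flag. The function names and signatures are hypothetical and only mirror the `contains` behaviour described above; they are not the actual implementation.

```rust
/// Hypothetical helper: substring match with an explicit flag.
fn contains(value: &str, substring: &str, case_sensitive: bool) -> bool {
    if case_sensitive {
        value.contains(substring)
    } else {
        // Compare lowercase copies for a case-insensitive match.
        value.to_lowercase().contains(&substring.to_lowercase())
    }
}

/// Wrapper applying the documented default: case sensitive.
fn contains_default(value: &str, substring: &str) -> bool {
    contains(value, substring, true)
}

fn main() {
    // With the documented default, differing case does not match.
    println!("{}", contains_default("The Needle", "needle"));
    // Opting out of case sensitivity makes the same input match.
    println!("{}", contains("The Needle", "needle", false));
}
```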

Signed-off-by: Jesse Szwedko <[email protected]>
…ordotdev#7138)

* fix(compression): Switch to MultiGzDecoder instead of GzDecoder

Fixes vectordotdev#7061

It appears that AWS's ALB logging gzips multi-part files, and we were
only reading the first part. I tested that `MultiGzDecoder` works on
simple gzip files so I figured we should switch to it everywhere. It was
already being used by the `file` source.

Signed-off-by: Jesse Szwedko <[email protected]>

* Add test for multi-part zst files

Signed-off-by: Jesse Szwedko <[email protected]>
* chore(performance): Fix remap benches

I broke them in vectordotdev#7118

Signed-off-by: Jesse Szwedko <[email protected]>
I backed out a few changes:

```
Upgrading grok v~1.0.1 -> v1.2.0
Upgrading async-graphql-warp v=2.6.4 -> v2.8.2
Upgrading once_cell v1.3 -> v1.7.2
Upgrading async-graphql v=2.6.4 -> v2.8.2
```

Due to incompatibilities in package dependencies.

I backed out

```
Upgrading db-key v0.0.5 -> v0.1.0
```

Because it requires code changes.

Signed-off-by: Jesse Szwedko <[email protected]>
Bumps [rust_decimal](https://github.com/paupino/rust-decimal) from 1.10.3 to 1.11.0.
- [Release notes](https://github.com/paupino/rust-decimal/releases)
- [Commits](paupino/rust-decimal@1.10.3...1.11.0)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…tordotdev#7152)

Fixes: vectordotdev#5716

Previously labels on docker containers were inserted in such a way that
dots in their names would end up creating a nested structure in the log
event due to the semantics of `LogEvent.insert`.

For example:

```json
{
  "container_created_at": "2021-04-16T18:53:19.946155600Z",
  "container_id": "d6bd69d4bc64bef20b4e992dcc23113741067e8268762694f92899504ae14319",
  "container_name": "docker_echo_1",
  "host": "COMP-C02DV25MML87",
  "image": "hashicorp/http-echo:latest",
  "label": {
    "com": {
      "docker": {
        "compose": {
          "config-hash": "e7e5ba19811180f27a7af36667652d0cd686599e6184cb023d9b71d791ff6a1e",
          "container-number": "1",
          "oneoff": "False",
          "project": {
            "config_files": "docker-compose.yml",
            "working_dir": "/private/tmp/docker"
          },
          "service": "echo",
          "version": "1.27.4"
        }
      }
    }
  },
  "message": "2021/04/16 19:14:10 localhost:5678 172.29.0.1:61824 \"GET / HTTP/1.1\" 200 6 \"curl/7.64.1\" 35.6µs",
  "source_type": "docker",
  "stream": "stdout",
  "timestamp": "2021-04-16T19:14:10.400790400Z"
}
```

This change ensures that labels are inserted as-is as keys:

```json
{
  "container_created_at": "2021-04-16T18:53:19.946155600Z",
  "container_id": "d6bd69d4bc64bef20b4e992dcc23113741067e8268762694f92899504ae14319",
  "container_name": "docker_echo_1",
  "host": "COMP-C02DV25MML87",
  "image": "hashicorp/http-echo:latest",
  "label": {
    "com.docker.compose.config-hash": "e7e5ba19811180f27a7af36667652d0cd686599e6184cb023d9b71d791ff6a1e",
    "com.docker.compose.container-number": "1",
    "com.docker.compose.oneoff": "False",
    "com.docker.compose.project": "docker",
    "com.docker.compose.project.config_files": "docker-compose.yml",
    "com.docker.compose.project.working_dir": "/private/tmp/docker",
    "com.docker.compose.service": "echo",
    "com.docker.compose.version": "1.27.4"
  },
  "message": "2021/04/16 19:12:01 localhost:5678 172.29.0.1:61820 \"GET / HTTP/1.1\" 200 6 \"curl/7.64.1\" 18.1µs",
  "source_type": "docker",
  "stream": "stdout",
  "timestamp": "2021-04-16T19:12:01.622769500Z"
}
```
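
The two shapes above can be reproduced with a small sketch. It assumes a hypothetical `Value` type and two illustrative helpers, `insert_path` and `insert_flat`; it does not use or describe the real `LogEvent` API. In particular, the choice to replace a scalar with a map when a longer path arrives is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an event value.
#[derive(Debug, PartialEq)]
enum Value {
    Text(String),
    Map(BTreeMap<String, Value>),
}

/// Path-style insert: every dot-separated segment becomes a nested map.
fn insert_path(map: &mut BTreeMap<String, Value>, path: &str, value: &str) {
    match path.split_once('.') {
        None => {
            map.insert(path.to_string(), Value::Text(value.to_string()));
        }
        Some((head, rest)) => {
            let entry = map
                .entry(head.to_string())
                .or_insert_with(|| Value::Map(BTreeMap::new()));
            // Assumption of this sketch: a scalar already stored at this
            // segment is replaced by a map, losing the scalar.
            if !matches!(*entry, Value::Map(_)) {
                *entry = Value::Map(BTreeMap::new());
            }
            if let Value::Map(child) = entry {
                insert_path(child, rest, value);
            }
        }
    }
}

/// Flat insert: the label name is used verbatim as the key.
fn insert_flat(map: &mut BTreeMap<String, Value>, key: &str, value: &str) {
    map.insert(key.to_string(), Value::Text(value.to_string()));
}

fn main() {
    let labels = [
        ("com.docker.compose.project", "docker"),
        ("com.docker.compose.project.working_dir", "/private/tmp/docker"),
    ];

    let mut nested = BTreeMap::new();
    let mut flat = BTreeMap::new();
    for &(key, value) in labels.iter() {
        insert_path(&mut nested, key, value);
        insert_flat(&mut flat, key, value);
    }

    println!("{:#?}", nested);
    println!("{:#?}", flat);
}
```

In this sketch the path-style insert cannot hold both `com.docker.compose.project` and `com.docker.compose.project.working_dir`, while the flat insert keeps both keys side by side.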

Signed-off-by: Jesse Szwedko <[email protected]>
Bumps [gouth](https://github.com/mechiru/gouth) from 0.2.0 to 0.2.1.
- [Release notes](https://github.com/mechiru/gouth/releases)
- [Commits](mechiru/gouth@v0.2.0...v0.2.1)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.49 to 0.1.50.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](dtolnay/async-trait@0.1.49...0.1.50)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This commit introduces a new RFC for extracting the core of vector. Closes vectordotdev#7027.

Signed-off-by: Brian L. Troutwine <[email protected]>
jszwedko and others added 28 commits May 26, 2021 10:07
…ordotdev#7487)

This is the documented behavior but we were not honoring it.

Signed-off-by: Jesse Szwedko <[email protected]>
This commit adds an explicit +Unpin to our buffer type definition, introducing
`EventStream` type alias to tidy up some repetition. This is done to support
pull request vectordotdev#7576 where this change is needed but is a little outside the
scope of that PR. The major change here is in `src/utilization` where I had to
re-implement the select loop to use an explicit stream definition. The `stream!`
macro used previously couldn't make the `+ Unpin` guarantee. It wasn't clear to
me that it could be fixed upstream.

Signed-off-by: Brian L. Troutwine <[email protected]>
…7581)

* fix(cli): include error details in `vector vrl` output

Currently, the actual details of the errors are masked so that users
just see things like: `parse error` and `json error` without additional
detail about what the issue is. This change ensures we display the
source of the error.

I also fixed the help doc for `--input` that was confusing a user as
they thought, understandably, that they could pass a pretty-printed JSON
object as a file; however `vector vrl` expects one JSON object per line.
We can improve this in the future to actually allow a pretty-printed
object here, but this at least makes the help doc accurate.

Signed-off-by: Jesse Szwedko <[email protected]>
…dev#7264)

* Early compact

Signed-off-by: ktf <[email protected]>

* Exact compaction

Signed-off-by: ktf <[email protected]>

* Add comments

Signed-off-by: ktf <[email protected]>

* More comments

Signed-off-by: ktf <[email protected]>

* More comments

Signed-off-by: ktf <[email protected]>

* Remove dashes

Signed-off-by: ktf <[email protected]>

* Add timeout

Signed-off-by: ktf <[email protected]>

* Clippy

Signed-off-by: ktf <[email protected]>

* Move changes

Signed-off-by: ktf <[email protected]>
Add initial implementation of `redact` to VRL.

Signed-off-by: Jesse Szwedko <[email protected]>
Bumps [rust_decimal](https://github.com/paupino/rust-decimal) from 1.14.0 to 1.14.1.
- [Release notes](https://github.com/paupino/rust-decimal/releases)
- [Commits](paupino/rust-decimal@1.14.0...1.14.1)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [semver](https://github.com/dtolnay/semver) from 0.11.0 to 1.0.0.
- [Release notes](https://github.com/dtolnay/semver/releases)
- [Commits](dtolnay/semver@0.11.0...1.0.0)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…otdev#7611)

Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](docker/setup-qemu-action@v1.1.0...v1.2.0)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…otdev#7612)

Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@v2.4.0...v2.5.0)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…lated changes (vectordotdev#7602)

* chore(dev): remove 'built' dep and stop needless rebuilding on unrelated changes

Signed-off-by: Toby Lawrence <[email protected]>
Bumps [actions/cache](https://github.com/actions/cache) from 2.1.5 to 2.1.6.
- [Release notes](https://github.com/actions/cache/releases)
- [Commits](actions/cache@v2.1.5...v2.1.6)

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Commit a07b55c didn't include all the
changes to `Cargo.lock` caused by the removal of the `semver` crate.

Signed-off-by: Bruce Guenter <[email protected]>
…v#7576)

This commit moves the top-level benchmarks of buffers into the core buffers crate. This is not a simple code move. There are two fundamental changes being proposed here: introduction of quickcheck tests and intentionally spare benchmarks.

## QuickCheck 

The first major change in this PR is the newly introduced QuickCheck tests for buffers, both for in-memory and on-disk buffers. The test is simplistic: every `T` that is sent into the buffer should come back out in order unless the buffer was over-full, in which case the `WhenFull` condition applies. The model we compare the buffer against is a `VecDeque` with logic hung off it to mimic the [`buffers + num-senders`](https://docs.rs/futures/0.3.15/futures/channel/mpsc/fn.channel.html) quirk of futures' mpsc and our own shedding logic. The test loop is done without reference to any external runtime: we use the bare interface of `Sink` and `Stream`. This avoids the non-determinism introduced by a runtime and has the happy side-benefit of making buffers more runtime agnostic. Notes are left in the model test indicating how we could expand the model.
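
A minimal sketch of the kind of `VecDeque`-backed model described above, written against the standard library only. The `Model`, `SendOutcome`, and local `WhenFull` definitions are hypothetical stand-ins, and the sketch deliberately leaves out the `buffers + num-senders` capacity quirk; it is not the model used in the tests.

```rust
use std::collections::VecDeque;

/// Stand-in for the buffer's over-full policy (assumed variants).
#[derive(Clone, Copy)]
enum WhenFull {
    Block,
    DropNewest,
}

/// What the model expects to happen to an item that was sent.
#[derive(Debug, PartialEq)]
enum SendOutcome {
    Accepted,
    Dropped,
    WouldBlock,
}

/// Reference model: a bounded FIFO queue with an over-full policy.
struct Model<T> {
    inner: VecDeque<T>,
    capacity: usize,
    when_full: WhenFull,
}

impl<T> Model<T> {
    fn new(capacity: usize, when_full: WhenFull) -> Self {
        Model {
            inner: VecDeque::with_capacity(capacity),
            capacity,
            when_full,
        }
    }

    /// Record a send. Items are accepted until the model is full, after
    /// which the `WhenFull` policy decides the outcome. In this sketch a
    /// rejected item is simply discarded.
    fn send(&mut self, item: T) -> SendOutcome {
        if self.inner.len() < self.capacity {
            self.inner.push_back(item);
            SendOutcome::Accepted
        } else {
            match self.when_full {
                WhenFull::DropNewest => SendOutcome::Dropped,
                WhenFull::Block => SendOutcome::WouldBlock,
            }
        }
    }

    /// Items come back out in the order they were accepted.
    fn recv(&mut self) -> Option<T> {
        self.inner.pop_front()
    }
}

fn main() {
    let mut shedding = Model::new(2, WhenFull::DropNewest);
    for item in 1..=3 {
        println!("send {} -> {:?}", item, shedding.send(item));
    }
    while let Some(item) = shedding.recv() {
        println!("recv {}", item);
    }

    let mut blocking = Model::new(1, WhenFull::Block);
    println!("{:?}", blocking.send("a"));
    println!("{:?}", blocking.send("b"));
}
```

A property test would drive the real buffer and this model with the same sequence of sends and receives and check that the two agree at every step.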

## Intentionally Spare Benchmarks 

The benchmark code follows a similar tactic to the quickcheck tests: it does not use a runtime and instead uses the bare `Sink` and `Stream` API. While criterion does have support for running async/await code, its documentation notes that this setup will introduce overhead and noise, undesirable in this core serialization point of vector. The benchmarks are done with respect to the in-memory and on-disk buffer variants, crossed by a write-then-read and write-and-read benchmark variation. Every effort has been taken to drive down noise as much as possible on my development machine and the results are promising, though we'll see how things shake out in the CI system. Perhaps controversially, the benchmark code _does not_ concern itself with the correctness of the buffer response, except in the most coarse sense. This is in line with the goal of driving down noise, reducing the amount of code running in the measurement loop, but runs a little counter to the project's previous strategy with regard to benchmarks. The intention here is for the quickcheck tests to tackle correctness issues and for the benchmark code to focus solely on measurement. The benchmark measurement loops are simplistic and fail-fast with regard to error conditions.

Resolves vectordotdev#7458 

Signed-off-by: Brian L. Troutwine <[email protected]>
)

* Introduce `fn Metric::{to,from}_parts`

* Use Metric::into_parts to convert into proto Metric

* Allow for converting events into proto items with metadata

* Add acknowledgement support to vector sink

Signed-off-by: Bruce Guenter <[email protected]>
* Min size

Signed-off-by: ktf <[email protected]>

* Raise to 4MB

Signed-off-by: ktf <[email protected]>

* Add references

Signed-off-by: ktf <[email protected]>
vectordotdev#7618)

* Add `fn {Event,Metric}::with_batch_notifier`

* Add support for acknowledgements to vector source

Signed-off-by: Bruce Guenter <[email protected]>
…rdotdev#7603)

We don't have a good replacement for this yet for all cases, so just remove the deprecation warnings for now. We plan to revisit schema management at some point which will likely involve dropping these.
…7536)

* enhancement(aws_s3 sink): Add acknowledgements support

Signed-off-by: Bruce Guenter <[email protected]>
…ordotdev#7535)

As there are no integration tests for this sink, this is untested.

Signed-off-by: Bruce Guenter <[email protected]>
… creation at runtime (vectordotdev#7601)

* fix(aws_cloudwatch_logs sink): Fix healthcheck to allow for log group creation at runtime

Signed-off-by: Spencer Gilbert <[email protected]>

* Fix clippy violation

Signed-off-by: Spencer Gilbert <[email protected]>

* Both dynamic group names and create_missing_group short circuit healthcheck
Signed-off-by: Spencer Gilbert <[email protected]>

* Always try to describe log group, don't error for dynamic or create on demand
Signed-off-by: Spencer Gilbert <[email protected]>

* Fix event messages to pass linting

Signed-off-by: Spencer Gilbert <[email protected]>

* Update healthcheck log messages

Signed-off-by: Spencer Gilbert <[email protected]>
* Batch deletes

Signed-off-by: ktf <[email protected]>

* Use VecDeque

Signed-off-by: ktf <[email protected]>

* Use take

Signed-off-by: ktf <[email protected]>

* Return unread_size

Signed-off-by: ktf <[email protected]>

* Unpin

Signed-off-by: ktf <[email protected]>
@langecode langecode merged commit cd16104 into master May 31, 2021
@langecode langecode deleted the feature/upstream-update branch May 31, 2021 08:59