[libbeat][reader] - Adding support for parquet reader #35183

Merged 56 commits on May 23, 2023
Changes from 52 commits
Commits
ffe109d
initial commit for s3 parquet support
ShourieG Apr 24, 2023
5295efd
updated changelog
ShourieG Apr 24, 2023
0f5b475
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG Apr 24, 2023
b41aa40
added license updates
ShourieG Apr 24, 2023
83598fa
updated notice and go mod/sum
ShourieG Apr 24, 2023
1ad3fe9
Merge branch 'main' into awss3/parquet
ShourieG Apr 24, 2023
f7c5498
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG Apr 24, 2023
ec642f5
removed libgering panic
ShourieG Apr 24, 2023
1664648
added parquet benchmark tests
ShourieG Apr 25, 2023
8f56a5e
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG Apr 25, 2023
4d090a3
updated osquery package due to update in dependant thrift package
ShourieG Apr 25, 2023
b370093
added parquet reader with benchmark tests and implemented that reader…
ShourieG Apr 26, 2023
e8e45af
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG Apr 26, 2023
2ff7b38
addressed linting errors
ShourieG Apr 26, 2023
2d8321b
refactored parquet reader, added tests and benchmarks and addressed p…
ShourieG Apr 28, 2023
cbf864c
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG Apr 28, 2023
42b7d06
addressed pr comments
ShourieG May 2, 2023
8119a06
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 2, 2023
6e6687d
resolved merged conflicts
ShourieG May 8, 2023
2c9d32a
resolved merged conflicts
ShourieG May 8, 2023
8c536a4
updated notice
ShourieG May 8, 2023
9b2e330
added more parquet file tests with json comparisons, addressed pr com…
ShourieG May 9, 2023
35df388
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 9, 2023
fc9c0c6
removed commented codeS
ShourieG May 9, 2023
ed6edca
removed bad imports & cleaned up tests
ShourieG May 12, 2023
6384a11
updated notice
ShourieG May 12, 2023
47c61a1
added graceful closures with err checks in test
ShourieG May 12, 2023
3049ee5
added graceful closures with err checks in test
ShourieG May 12, 2023
9aa292c
removed s3 parquet implementation from this PR
ShourieG May 12, 2023
b2bb28a
removed s3 parquet implementation from this PR
ShourieG May 12, 2023
7a14816
Update filebeat.yml
ShourieG May 12, 2023
a38c2fc
Update filebeat.yml
ShourieG May 12, 2023
966e124
merged with upstream
ShourieG May 12, 2023
c5e0cdf
updated notice
ShourieG May 12, 2023
89b2db3
Merge branch 'awss3/parquet' of github.com:ShourieG/beats into awss3/…
ShourieG May 12, 2023
1a2ee02
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 12, 2023
9791753
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 15, 2023
437eb11
addressed PR suggestions
ShourieG May 15, 2023
a708694
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 15, 2023
364a554
addressed PR comments
ShourieG May 16, 2023
ec4185b
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 16, 2023
110134f
updated godoc comment
ShourieG May 16, 2023
17fb249
addressed PR comments, switched path with filebath
ShourieG May 16, 2023
92162f3
updated CODEOWNERS and addressed PR comments
ShourieG May 18, 2023
57344d2
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 18, 2023
db1d3a4
addressed PR comments, added a rand seeding process
ShourieG May 18, 2023
e6da1e3
fixed test seed value to 1
ShourieG May 18, 2023
d32f412
updated comments
ShourieG May 18, 2023
b3e69b5
removed defers in loops
ShourieG May 18, 2023
39ce083
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 18, 2023
f4d6019
merged with upstream and resolved conflicts
ShourieG May 19, 2023
339f57d
updated notice
ShourieG May 19, 2023
8c6e7e0
updated godoc comments as suggested
ShourieG May 23, 2023
6c05241
Merge remote-tracking branch 'upstream/main' into awss3/parquet
ShourieG May 23, 2023
766c9da
updated changelog
ShourieG May 23, 2023
1243902
Update x-pack/libbeat/reader/parquet/parquet.go
ShourieG May 23, 2023
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -154,3 +154,4 @@ CHANGELOG*
 /x-pack/osquerybeat/ @elastic/security-asset-management
 /x-pack/packetbeat/ @elastic/security-external-integrations
 /x-pack/winlogbeat/ @elastic/security-external-integrations
+/x-pack/libbeat/reader/parquet/ @elastic/security-external-integrations
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -280,6 +280,7 @@ automatic splitting at root level, if root level element is an array. {pull}3415
 - Add nginx ingress_controller parsing if one of upstreams fails to return response {pull}34787[34787]
 - Allow neflow v9 and ipfix templates to be shared between source addresses. {pull}35036[35036]
 - Add support for collecting IPv6 metrics. {pull}35123[35123]
+- Added support for apache parquet files to aws-s3 input. {issue}34662[34662] {pull}35183[35183]
 - Add oracle authentication messages parsing {pull}35127[35127]
 - Add support for CRC validation in Filebeat's HTTP endpoint input. {pull}35204[35204]
 - Add execution budget to CEL input. {pull}35409[35409]
11,383 changes: 6,992 additions & 4,391 deletions NOTICE.txt

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions dev-tools/notice/overrides.json
@@ -16,3 +16,4 @@
 {"name": "kernel.org/pub/linux/libs/security/libcap/psx", "licenceType": "BSD-3-Clause", "note": "dual licensed as GPL-v2 and BSD"}
 {"name": "github.com/awslabs/kinesis-aggregation/go/v2", "licenceType": "Apache-2.0", "url": "https://github.com/awslabs/kinesis-aggregation/blob/master/LICENSE.txt"}
 {"name": "github.com/dnaeon/go-vcr", "licenceType": "BSD-2-Clause"}
+{"name": "github.com/JohnCGriffin/overflow", "licenceType": "MIT"}
1 change: 1 addition & 0 deletions dev-tools/notice/rules.json
@@ -5,6 +5,7 @@
 "BSD-2-Clause",
 "BSD-2-Clause-FreeBSD",
 "BSD-3-Clause",
+"CC0-1.0",
 "Elastic",
 "ISC",
 "MIT",
21 changes: 15 additions & 6 deletions go.mod
@@ -17,7 +17,7 @@ require (
 github.com/Azure/go-autorest/autorest/azure/auth v0.4.2
 github.com/Azure/go-autorest/autorest/date v0.3.0
 github.com/Masterminds/semver v1.5.0
-github.com/Microsoft/go-winio v0.6.0
+github.com/Microsoft/go-winio v0.6.1
 github.com/PaesslerAG/gval v1.0.0
 github.com/PaesslerAG/jsonpath v0.1.1
 github.com/Shopify/sarama v1.27.0
@@ -99,7 +99,7 @@ require (
 github.com/golang/protobuf v1.5.2
 github.com/golang/snappy v0.0.4
 github.com/gomodule/redigo v1.8.3
-github.com/google/flatbuffers v1.12.1
+github.com/google/flatbuffers v23.3.3+incompatible
 github.com/google/go-cmp v0.5.9
 github.com/google/gopacket v1.1.19
 github.com/google/uuid v1.3.0
@@ -126,7 +126,7 @@ require (
 github.com/mitchellh/hashstructure v1.1.0
 github.com/mitchellh/mapstructure v1.5.0
 github.com/olekukonko/tablewriter v0.0.5
-github.com/osquery/osquery-go v0.0.0-20210622151333-99b4efa62ec5
+github.com/osquery/osquery-go v0.0.0-20220706183148-4e1f83012b42
 github.com/otiai10/copy v1.2.0
 github.com/pierrre/gotestcover v0.0.0-20160517101806-924dca7d15f0
 github.com/pkg/errors v0.9.1
@@ -188,6 +188,7 @@ require (
 cloud.google.com/go/redis v1.10.0
 github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v0.4.1
 github.com/Azure/go-autorest/autorest/adal v0.9.14
+github.com/apache/arrow/go/v11 v11.0.0
 github.com/aws/aws-sdk-go-v2/feature/s3/manager v1.11.17
 github.com/aws/aws-sdk-go-v2/service/cloudformation v1.20.4
 github.com/aws/aws-sdk-go-v2/service/kinesis v1.15.8
@@ -234,8 +235,10 @@ require (
 github.com/Azure/go-autorest/logger v0.2.1 // indirect
 github.com/Azure/go-autorest/tracing v0.6.0 // indirect
 github.com/AzureAD/microsoft-authentication-library-for-go v0.5.1 // indirect
+github.com/JohnCGriffin/overflow v0.0.0-20211019200055-46fa312c352c // indirect
+github.com/andybalholm/brotli v1.0.5 // indirect
 github.com/antlr/antlr4/runtime/Go/antlr v1.4.10 // indirect
-github.com/apache/thrift v0.13.1-0.20200603211036-eac4d0c79a5f // indirect
+github.com/apache/thrift v0.18.1 // indirect
 github.com/armon/go-radix v1.0.0 // indirect
 github.com/aws/aws-sdk-go v1.38.60 // indirect
 github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.4.3 // indirect
@@ -270,6 +273,7 @@ require (
 github.com/go-ole/go-ole v1.2.6 // indirect
 github.com/go-stack/stack v1.8.0 // indirect
 github.com/gobuffalo/here v0.6.7 // indirect
+github.com/goccy/go-json v0.9.11 // indirect
 github.com/godror/knownpb v0.1.0 // indirect
 github.com/golang-sql/civil v0.0.0-20190719163853-cb61b32ac6fe // indirect
 github.com/golang-sql/sqlexp v0.1.0 // indirect
@@ -298,13 +302,17 @@ require (
 github.com/json-iterator/go v1.1.12 // indirect
 github.com/karrick/godirwalk v1.17.0 // indirect
 github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 // indirect
-github.com/klauspost/compress v1.15.9 // indirect
+github.com/klauspost/asmfmt v1.3.2 // indirect
+github.com/klauspost/compress v1.16.5 // indirect
+github.com/klauspost/cpuid/v2 v2.0.9 // indirect
 github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
 github.com/mailru/easyjson v0.7.6 // indirect
 github.com/markbates/pkger v0.17.1 // indirect
-github.com/mattn/go-isatty v0.0.14 // indirect
+github.com/mattn/go-isatty v0.0.16 // indirect
 github.com/mattn/go-runewidth v0.0.9 // indirect
 github.com/matttproud/golang_protobuf_extensions v1.0.2-0.20181231171920-c182affec369 // indirect
+github.com/minio/asm2plan9s v0.0.0-20200509001527-cdd76441f9d8 // indirect
+github.com/minio/c2goasm v0.0.0-20190812172519-36a3d3bbc4f3 // indirect
 github.com/mitchellh/go-homedir v1.1.0 // indirect
 github.com/mitchellh/iochan v1.0.0 // indirect
 github.com/moby/spdystream v0.2.0 // indirect
@@ -333,6 +341,7 @@ require (
 github.com/xdg/stringprep v1.0.3 // indirect
 github.com/youmark/pkcs8 v0.0.0-20181117223130-1be2e3e5546d // indirect
 github.com/yusufpapurcu/wmi v1.2.2 // indirect
+github.com/zeebo/xxh3 v1.0.2 // indirect
 go.elastic.co/fastjson v1.1.0 // indirect
 go.opencensus.io v0.23.0 // indirect
 golang.org/x/exp v0.0.0-20220921023135-46d9e7742f1e // indirect