Switch partition metricset from client to broker #3029

Merged
merged 6 commits into from
Dec 1, 2016

Conversation

ruflin
Contributor

@ruflin ruflin commented Nov 17, 2016

No description provided.

@ruflin ruflin added the in progress Pull request is currently in progress. label Nov 17, 2016
@ruflin ruflin changed the title Switch paritition metricset from client to broker Switch partition metricset from client to broker Nov 17, 2016
@@ -37,57 +36,60 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
// Fetch partition stats list from kafka
func (m *MetricSet) Fetch() ([]common.MapStr, error) {

-if m.client == nil {
+if m.broker == nil {

Consider creating m.broker at metricset initialization, then calling m.broker.Connected() to see whether the broker is connected and reconnecting if it is not. As Connected only checks whether a connection object exists, you could try Heartbeat instead.

Alternatively (to not keep connection open for too long), use defer m.broker.Close() and always call Open.

Contributor Author

I moved the broker part to New(). Undecided which approach I should follow for the open / close part :-)
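
The second option from the comment above (open on every fetch and close with defer) can be sketched roughly as follows. This is only an illustration: the `conn` interface and `fakeConn` type are hypothetical stand-ins, not the sarama API, and the real metricset may well handle this differently.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for the broker connection discussed in
// the review; it is NOT the sarama Broker type.
type conn interface {
	Open() error
	Close() error
	Metadata() ([]string, error)
}

// fetchOnce opens the connection, queries it, and always closes it again,
// so no connection stays open between collection periods.
func fetchOnce(c conn) ([]string, error) {
	if err := c.Open(); err != nil {
		return nil, fmt.Errorf("open broker connection: %w", err)
	}
	// Close only runs when Open succeeded.
	defer c.Close()
	return c.Metadata()
}

// fakeConn counts calls so the open/close pairing can be observed.
type fakeConn struct {
	opens, closes int
	failOpen      bool
}

func (f *fakeConn) Open() error {
	if f.failOpen {
		return errors.New("connection refused")
	}
	f.opens++
	return nil
}

func (f *fakeConn) Close() error { f.closes++; return nil }

func (f *fakeConn) Metadata() ([]string, error) {
	return []string{"testtopic"}, nil
}

func main() {
	c := &fakeConn{}
	topics, err := fetchOnce(c)
	fmt.Println(topics, err, c.opens, c.closes)
}
```

The trade-off is one connection setup per period in exchange for never holding a stale connection.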

}

-topics, err := m.client.Topics()
+response, err := m.broker.GetMetadata(&sarama.MetadataRequest{})

how about optional topic names configuration? e.g. metricset.kafka.partitions.topics=[...] with default being empty string. Would this work here?

Question: does the call to GetMetadata return the complete cluster state, or only the topics available to the broker being queried?

Contributor Author

Topic filter "should" work here but I would keep this also for a version 2. It is already possible to filter topics out, but you would need filters which I think is ok for now.

For the question: TBH I'm not sure. Based on https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-TopicMetadataRequest I would expect that we would get an array for each broker. But we only get a list of topics back.

if err != nil {
logp.Err("Fetching brocker for partition %s in topic %s: %s", partition, topic, err)
}
offsetResponse, _ := m.broker.GetAvailableOffsets(offsetRequest)

Does GetAvailableOffsets add the current broker ID to the API request?

Checking the 'encode' method, it seems ReplicaID is always -1. Does that mean we query offsets from the leader?

Contributor Author

Looks like broker id is not set. Do we actually query the same information as with client? :-(

"partition": common.MapStr{
"id": partition.ID,
"error": partition.Err,
"leader": partition.Leader,

Is partition.leader == broker.id if the current broker is the leader? If so, do we want to add a 'leader' flag somewhere in the document?

Contributor Author

Yes, or at least that is my assumption. I don't think we need to add a special "boolean" to the event, as a query with a script filter should make it easy to get the above result.

@ruflin ruflin force-pushed the kafka-partition-refactoring branch 2 times, most recently from 39edbf9 to 7940fc9 Compare November 18, 2016 13:01
@ruflin ruflin added Metricbeat Metricbeat review in progress Pull request is currently in progress. and removed in progress Pull request is currently in progress. review labels Nov 18, 2016
@urso urso force-pushed the kafka-partition-refactoring branch from 19a2e8d to 7940fc9 Compare November 23, 2016 18:18
@ruflin ruflin force-pushed the kafka-partition-refactoring branch from 46889cd to 424aa79 Compare November 30, 2016 07:04
@ruflin ruflin added review and removed in progress Pull request is currently in progress. labels Nov 30, 2016
Update kafka broker query

- on connect try to find the broker id (address must match advertised host).
- check broker is leader before querying offsets
- query offsets for all replicas
- remove 'isr' from event, and replace with boolean flag `insync_replica`
- replace `replicas` from event with per event `replica`-id
- update sarama to get offset per replica id
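
The first item in the list above (finding the broker ID by matching the configured address against the advertised host) might look roughly like this. It is a sketch under assumptions: the `broker` struct, the `findBrokerID` helper, and the value chosen for `noID` are all hypothetical, and the case-insensitive comparison is an illustrative choice rather than what this commit does.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// broker is a hypothetical pair of broker ID and advertised address,
// as it might be read from a metadata response.
type broker struct {
	ID   int32
	Addr string // "host:port" as advertised by the broker
}

// noID marks "no matching broker found". The value -1 is an assumption.
const noID int32 = -1

// findBrokerID returns the ID of the broker whose advertised host equals
// localHost, or noID when none matches.
func findBrokerID(brokers []broker, localHost string) int32 {
	for _, b := range brokers {
		host, _, err := net.SplitHostPort(b.Addr)
		if err != nil {
			// No port present; treat the whole string as the host.
			host = b.Addr
		}
		// Host names are case-insensitive, so compare with EqualFold.
		if strings.EqualFold(host, localHost) {
			return b.ID
		}
	}
	return noID
}

func main() {
	brokers := []broker{
		{ID: 1, Addr: "kafka-1.example.com:9092"},
		{ID: 2, Addr: "kafka-2.example.com:9092"},
	}
	fmt.Println(findBrokerID(brokers, "kafka-2.example.com"))
	fmt.Println(findBrokerID(brokers, "localhost"))
}
```

If no broker advertises the configured host, the sketch returns `noID` and the caller can close the connection and skip the fetch.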
@ruflin ruflin force-pushed the kafka-partition-refactoring branch from 3629d42 to 03cb921 Compare November 30, 2016 09:41
@@ -62,7 +62,7 @@ import:
- package: github.com/miekg/dns
version: 5d001d020961ae1c184f9f8152fdc73810481677
- package: github.com/Shopify/sarama
-version: fix/sasl-handshake
+version: enh/offset-replica-id
Contributor

Just checking, does this branch contain the sasl fix?


yes, the branch is directly branched off from sasl-handshake.

description: >
Indicates if the replica is included in the in-sync replica set (ISR).

- name: error
Contributor

Maybe error_code for clarity?

Contributor Author

Hm, if we switch this one to error_code, we would have to change the other 2 errors too. WDYT?

Contributor

error_code seems a bit more natural to me, but I don't feel too strongly about it.

}

if m.id == noID {
b.Close()
Contributor

If b.Close() is needed here, maybe it's needed on the other error branches as well?


Only close b if we return nil. That is, a close is missing at line 68.

Contributor Author

I added the close to line 68

meta, err := b.GetMetadata(&sarama.MetadataRequest{})
if err != nil {
return nil, err
}

this call might fail if leader-election is active.

Contributor Author

See default sarama settings below. As we do the connect for each call I think it can be ok if one of them fails.

b := m.broker
if err := b.Open(m.cfg); err != nil {
return nil, err
}

will open fail if leader-election is active?

Contributor Author

Not sure. But my thinking here is similar to my comment below. It is ok if it fails from time to time, as we call connect() for each Fetch. I prefer that it fails instead of waiting for a long time.


Well, the wait is only a few ms. If we fail, a document will be missing for one period.

}

-topics, err := m.client.Topics()
+defer b.Close()
+response, err := b.GetMetadata(&sarama.MetadataRequest{})

might fail if leader-election is active? consider retry (max 3 times) with waits of a few milliseconds (see sarama settings for defaults)

Contributor Author

We use the defaults which are:

c.Metadata.Retry.Max = 3
c.Metadata.Retry.Backoff = 250 * time.Millisecond

I think we are good with these defaults.


well, these defaults are 'internal' to sarama. If you use client/(a)sync-publisher, sarama updates metadata from time to time. With us calling b.GetMetadata directly these settings are not in effect I think.
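
If the sarama retry settings indeed do not apply to a direct GetMetadata call, an explicit retry loop along the lines suggested earlier (a few attempts with a short wait) could be sketched as below. The `retry` helper and `errTemporary` are hypothetical names for illustration; the values 3 and 250 ms simply mirror the defaults quoted above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a hypothetical error standing in for a failure seen
// while a leader election is in progress.
var errTemporary = errors.New("leader election in progress")

// retry runs fn once and then up to max more times, sleeping backoff
// between attempts. It returns nil on the first success, otherwise the
// error of the last attempt.
func retry(max int, backoff time.Duration, fn func() error) error {
	var err error
	for attempt := 0; attempt <= max; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < max {
			time.Sleep(backoff)
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(3, 250*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTemporary // first two attempts fail
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

The total delay stays bounded (here at most three waits of 250 ms), which fits the preference stated earlier for failing over waiting a long time.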

@urso urso added in progress Pull request is currently in progress. and removed review labels Nov 30, 2016

urso commented Nov 30, 2016

Doing some tests with master + this PR:

  • connect to local broker only, instead of full cluster
    • just one error message in logs if broker is not reachable, with error message including correct IO error (used to be loads of warnings from sarama)
  • fixes issue with metricbeat reporting offsets at -1 if partition is owned by another (potentially unreachable) broker. Change includes reporting topics/partitions only if current broker is leader (limitation by kafka API)
  • fix oldest partition offset being off, by explicitly querying for oldest available offset
  • PR tries to query offsets for each replica.
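
The leader check and the per-replica reporting described in this summary can be illustrated with a small sketch. The `partitionMeta` struct and `replicaEvents` function are hypothetical and trimmed down; the event keys other than `replica` and `insync_replica` are illustrative, and the offset queries themselves are left out.

```go
package main

import "fmt"

// partitionMeta is a hypothetical, trimmed-down view of the partition
// metadata returned by a broker.
type partitionMeta struct {
	ID       int32
	Leader   int32
	Replicas []int32
	ISR      []int32
}

// replicaEvents returns one event per replica of p. It returns nil when
// brokerID is not the partition leader, since only the leader is queried.
func replicaEvents(brokerID int32, topic string, p partitionMeta) []map[string]interface{} {
	if p.Leader != brokerID {
		return nil
	}
	// Index the in-sync replica set for constant-time lookups.
	inSync := make(map[int32]bool, len(p.ISR))
	for _, id := range p.ISR {
		inSync[id] = true
	}
	events := make([]map[string]interface{}, 0, len(p.Replicas))
	for _, id := range p.Replicas {
		events = append(events, map[string]interface{}{
			"topic":          topic,
			"partition":      p.ID,
			"leader":         p.Leader,
			"replica":        id,
			"insync_replica": inSync[id],
		})
	}
	return events
}

func main() {
	p := partitionMeta{ID: 0, Leader: 1, Replicas: []int32{1, 2, 3}, ISR: []int32{1, 3}}
	for _, e := range replicaEvents(1, "testtopic", p) {
		fmt.Println(e["replica"], e["insync_replica"])
	}
}
```

Emitting one event per replica replaces the former `replicas` and `isr` arrays with a `replica` ID and a boolean flag on each event.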

@urso urso added review and removed in progress Pull request is currently in progress. labels Nov 30, 2016
Contributor

tsg commented Dec 1, 2016

jenkins, retest it

Contributor Author

@ruflin ruflin left a comment

I left you a few additional comments which can be handled in a second PR later.

@@ -4,3 +4,24 @@
#period: 10s
#hosts: ["localhost:9092"]

#client_id: metricbeat
Contributor Author

I would remove all these config options from the short config.

@@ -94,6 +94,27 @@ metricbeat.modules:
#period: 10s
#hosts: ["localhost:9092"]

#client_id: metricbeat

#metadata.retries: 3
Contributor Author

Not sure if we should use the metadata namespace here, as in the end, from a user perspective, it is just retries. Metadata is more of an internal "logic" thing.


makes sense.

0
],
"topic": "testtopic"
"partition": {
Contributor Author

This needs update. I will do a PR as soon as this is merged.

for _, topic := range response.Topics {
evtTopic := common.MapStr{
"name": topic.Name,
"error": common.MapStr{
Contributor Author

As long as error is empty, we should not push it.


error is not empty, but has error.code: 0.

Contributor Author

I'm now checking for error.code != 0 and if 0 ignore it.
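
A check of this kind could look roughly like the following sketch. The `partitionEvent` function and the event layout are hypothetical; the only fact relied on is that code 0 is Kafka's "no error" code.

```go
package main

import "fmt"

// partitionEvent builds the partition part of an event. The "error" field
// is only added when errCode is non-zero, because 0 is Kafka's "no error"
// code. The function name and layout are illustrative assumptions.
func partitionEvent(id int32, errCode int16) map[string]interface{} {
	evt := map[string]interface{}{"id": id}
	if errCode != 0 {
		evt["error"] = map[string]interface{}{"code": errCode}
	}
	return evt
}

func main() {
	fmt.Println(partitionEvent(0, 0)) // no "error" key
	fmt.Println(partitionEvent(0, 3)) // "error" key with code 3
}
```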

"broker": evtBroker,
"partition": common.MapStr{
"id": partition.ID,
"error": common.MapStr{
Contributor Author

Same here for the error: If it is empty, we should not add it to the event.


error.code: 0 is a valid error code in Kafka. That's why I kept it in.

Contributor Author

oh, it is a valid error code?!? :-( What is "no" error?

@ruflin ruflin merged commit 4c4889f into elastic:master Dec 1, 2016
@ruflin ruflin deleted the kafka-partition-refactoring branch December 1, 2016 07:14
ruflin added a commit to ruflin/beats that referenced this pull request Dec 1, 2016
Update kafka broker query

- Switch paritition metricset from client to broker
- on connect try to find the broker id (address must match advertised host).
- check broker is leader before querying offsets
- query offsets for all replicas
- remove 'isr' from event, and replace with boolean flag `insync_replica`
- replace `replicas` from event with per event `replica`-id
- update sarama to get offset per replica id

(cherry picked from commit 4c4889f)
tsg pushed a commit that referenced this pull request Dec 1, 2016
(cherry picked from commit 4c4889f)
suraj-soni pushed a commit to suraj-soni/beats that referenced this pull request Dec 15, 2016
monicasarbu pushed a commit that referenced this pull request Dec 19, 2016
* Rewrite elasticsearch connection URL (#3058)
* Fix metricbeat service times-out at startup (#3056)
* remove init collecting of processes
* add changelog entry

* Clarify that json.message_key is optional in Filebeat (#3055)

I reordered the options based on importance (I put the optional config setting at the end).

And I changed the wording to further clarify that the `json.message_key` setting is optional.

Fixes #2864

* Document add_cloud_metadata processor (#3054)

Fixes #2791

* Remove process.GetProcStatsEvents as not needed anymore (#3066)

* Fix testing for 2x releases (#3057)

* Update docker files to the last major with the most recent minor and bugfix version
* Renamed files to Dockerfile-2x to not have to be renamed every time a new bugfix is released
* Remove scripts and config files which are not needed anymore

To run testsuite for 2x releases, run: `TESTING_ENVIRONMENT=2x make testsuite`

* Remove old release notes files from packetbeat docs (#3067)

* Update go-ucfg (#3045)

- Update go-ucfg
- add support for parsing lists/dictionaries from environment variables and via
  `-E` flag

* Parse elasticsearch URL before logging it (#3075)

* Fix the total CPU time in the Docker dashboard (#3085) (#3086)

Part of #2629. The name of the field was changed, but not in the dashboard.
(cherry picked from commit e271d9f)

* Switch partition metricset from client to broker (#3029)

Update kafka broker query

- Switch paritition metricset from client to broker
- on connect try to find the broker id (address must match advertised host).
- check broker is leader before querying offsets
- query offsets for all replicas
- remove 'isr' from event, and replace with boolean flag `insync_replica`
- replace `replicas` from event with per event `replica`-id
- update sarama to get offset per replica id

* Make error fields optional in partition event (#3089)

* Update data.json

* Make it clear in the docs that publish_async is still experimental (#3096)

Remove example for publish_async from the docs

* Remove metadata prefix from config as not needed (#3095)

* Remove left over string in template test (#3102)

* Fix typo in Dockerfile comment (#3105)

* Document batch_read_size is experimental in Winlogbeat

* Add benchmark test for batch_read_size in Winlogbeat (#3107)

* Fix ES 2.x integration test (#3115)

There was a test that was loading a mock template, and this template
was assuming 5.x.

* Pass `--always-copy` to virtualenv (#3082)

virtualenv creates symlinks so `make setup` fails when ran on a network mounted
fs. `--always-copy` copies files to the destination dir rather than symlinking.

* Add project prefix for composer environment (#3116)

This prefix is needed to run tests with different environments in parallel so one does not affect the other. This way, 2x and snapshot builds should be able to coexist.

* Reduce allocations in UTF16 conversion (#3113)

When decoding a UTF16 string contained in a buffer larger than just the string, more space was allocated than required.

```
BenchmarkUTF16BytesToString/simple_string-4         	 2000000	       846 ns/op	     384 B/op	       3 allocs/op
BenchmarkUTF16BytesToString/larger_buffer-4         	 2000000	       874 ns/op	     384 B/op	       3 allocs/op
BenchmarkUTF16BytesToString_Original/simple_string-4         	 2000000	       840 ns/op	     384 B/op	       3 allocs/op
BenchmarkUTF16BytesToString_Original/larger_buffer-4         	 1000000	      3055 ns/op	    8720 B/op	       3 allocs/op
```

```
PS C:\Gopath\src\github.com\elastic\beats\winlogbeat> go test -v github.com/elastic/beats/winlogbeat/eventlog -run ^TestBenchmarkBatchReadSize$ -benchmem -benchtime 10s -benchtest
=== RUN   TestBenchmarkBatchReadSize
--- PASS: TestBenchmarkBatchReadSize (68.04s)
        bench_test.go:100: batch_size=10, total_events=20000, batch_time=5.682627ms, events_per_sec=1759.7494961397256, bytes_alloced_per_event=44 kB, total_allocs=4923840
        bench_test.go:100: batch_size=100, total_events=30000, batch_time=53.850879ms, events_per_sec=1856.9799018508127, bytes_alloced_per_event=44 kB, total_allocs=7354285
        bench_test.go:100: batch_size=500, total_events=25000, batch_time=271.118774ms, events_per_sec=1844.2101689350366, bytes_alloced_per_event=43 kB, total_allocs=6125665
        bench_test.go:100: batch_size=1000, total_events=30000, batch_time=558.03918ms, events_per_sec=1791.9888707455987, bytes_alloced_per_event=43 kB, total_allocs=7350324
PASS
ok      github.com/elastic/beats/winlogbeat/eventlog    68.095s

PS C:\Gopath\src\github.com\elastic\beats\winlogbeat> go test -v github.com/elastic/beats/winlogbeat/eventlog -run ^TestBenchmarkBatchReadSize$ -benchmem -benchtime 10s -benchtest
=== RUN   TestBenchmarkBatchReadSize
--- PASS: TestBenchmarkBatchReadSize (71.85s)
        bench_test.go:100: batch_size=10, total_events=30000, batch_time=5.713873ms, events_per_sec=1750.1264028794478, bytes_alloced_per_event=25 kB, total_allocs=7385820
        bench_test.go:100: batch_size=100, total_events=30000, batch_time=52.454484ms, events_per_sec=1906.4147118480853, bytes_alloced_per_event=24 kB, total_allocs=7354318
        bench_test.go:100: batch_size=500, total_events=25000, batch_time=260.56659ms, events_per_sec=1918.8952812407758, bytes_alloced_per_event=24 kB, total_allocs=6125688
        bench_test.go:100: batch_size=1000, total_events=30000, batch_time=530.468816ms, events_per_sec=1885.124949550286, bytes_alloced_per_event=24 kB, total_allocs=7350360
PASS
ok      github.com/elastic/beats/winlogbeat/eventlog    71.908s
```

* Fix for errno 1734 when calling EvtNext (#3112)

When reading a batch of large event log records the Windows function
EvtNext returns errno 1734 (0x6C6) which is RPC_S_INVALID_BOUND ("The
array bounds are invalid."). This seems to be a bug in Windows because
there is no documentation about this behavior.

This fix handles the error by resetting the event log subscription
handle (so events are not lost) and then retries the EvtNext call
with maxHandles/2.

Fixes #3076

* Fetch container stats in parallel (#3127)

Currently, fetching container stats is very slow, as each request takes up to 2 seconds. To improve the fetching time when many containers are around, this issues the requests in parallel. The main downside is that this opens lots of connections. This fix should only be temporary until the bulk API is available: moby/moby#25361

* Fix heartbeat not accepting `mode` parameter (#3128)

* Remove fixed container names as not needed (#3122)

Add beat name to project namespace

* This makes sure different beats environment do not affect each other for example when Kafka is used
* It also allows to run the testsuites of all the beats in parallel

Introduce `stop-environment` command to stop all containers

* Add doc for decode_json_fields processor (#3110)

* Add doc for decode_json_fields processor
* Use changed param names
* Add example of decode_json_fields processor
* Fix intro language about processors

* Adding AmazonBeat to community beats (#3125)

I created a basic version of amazonbeat, which reads data from an amazon product periodically. This beat does not yet publish to elasticsearch.

* Reuse a byte buffer for holding XML (#3118)

Previously the data was read into a []byte encoded as UTF16. Then that
data was converted to []uint16 so that we can use utf16.Decode(). Then
the []rune slice was converted to a string which did another data copy.
The XML was unmarshalled from the string.

This PR changes the code to convert the UTF16 []byte directly to UTF8 and
puts the result into a reusable bytes.Buffer. The XML is then unmarshalled
directly from the data in buffer.

```
BenchmarkUTF16ToUTF8-4   	 2000000	      1044 ns/op        4 B/op      1 allocs/op
```

```
git checkout 6ba7700
PS > go test github.com/elastic/beats/winlogbeat/eventlog -run TestBenc -benchtest -benchtime 10s -v
=== RUN   TestBenchmarkBatchReadSize
--- PASS: TestBenchmarkBatchReadSize (67.89s)
        bench_test.go:100: batch_size=10, total_events=30000, batch_time=5.119626ms, events_per_sec=1953.2676801000696, bytes_alloced_per_event=44 kB, total_allocs=7385952
        bench_test.go:100: batch_size=100, total_events=30000, batch_time=51.366271ms, events_per_sec=1946.802795943665, bytes_alloced_per_event=44 kB, total_allocs=7354448
        bench_test.go:100: batch_size=500, total_events=25000, batch_time=250.974356ms, events_per_sec=1992.2354138842775, bytes_alloced_per_event=43 kB, total_allocs=6125812
        bench_test.go:100: batch_size=1000, total_events=30000, batch_time=514.796113ms, events_per_sec=1942.5166094834128, bytes_alloced_per_event=43 kB, total_allocs=7350550
PASS
ok      github.com/elastic/beats/winlogbeat/eventlog    67.950s

git checkout 833a806 (#3113)
PS > go test github.com/elastic/beats/winlogbeat/eventlog -run TestBenc -benchtest -benchtime 10s -v
=== RUN   TestBenchmarkBatchReadSize
--- PASS: TestBenchmarkBatchReadSize (65.69s)
        bench_test.go:100: batch_size=10, total_events=30000, batch_time=4.858277ms, events_per_sec=2058.3429063431336, bytes_alloced_per_event=25 kB, total_allocs=7385847
        bench_test.go:100: batch_size=100, total_events=30000, batch_time=51.612952ms, events_per_sec=1937.49816906423, bytes_alloced_per_event=24 kB, total_allocs=7354362
        bench_test.go:100: batch_size=500, total_events=25000, batch_time=241.713826ms, events_per_sec=2068.561853801445, bytes_alloced_per_event=24 kB, total_allocs=6125757
        bench_test.go:100: batch_size=1000, total_events=30000, batch_time=494.961643ms, events_per_sec=2020.3585755431961, bytes_alloced_per_event=24 kB, total_allocs=7350474
PASS
ok      github.com/elastic/beats/winlogbeat/eventlog    65.747s

This PR (#3118)
PS > go test github.com/elastic/beats/winlogbeat/eventlog -run TestBenc -benchtest -benchtime 10s -v
=== RUN   TestBenchmarkBatchReadSize
--- PASS: TestBenchmarkBatchReadSize (65.80s)
        bench_test.go:100: batch_size=10, total_events=30000, batch_time=4.925281ms, events_per_sec=2030.341009985014, bytes_alloced_per_event=14 kB, total_allocs=7295817
        bench_test.go:100: batch_size=100, total_events=30000, batch_time=48.976134ms, events_per_sec=2041.8108134055658, bytes_alloced_per_event=14 kB, total_allocs=7264329
        bench_test.go:100: batch_size=500, total_events=25000, batch_time=250.314316ms, events_per_sec=1997.4886294557757, bytes_alloced_per_event=14 kB, total_allocs=6050719
        bench_test.go:100: batch_size=1000, total_events=30000, batch_time=499.861923ms, events_per_sec=2000.5524605641945, bytes_alloced_per_event=14 kB, total_allocs=7260400
PASS
ok      github.com/elastic/beats/winlogbeat/eventlog    65.856s
```

* Fix make package for community beats (#3094)

gopkg.in needs to be copied from the vendor directory of libbeat in the vendor directory

* Auto generate modules list (#3131)

This is to ensure no modules are forgotten in the future

* Remove duplicated enabled entry from redis config (#3132)

* Remove --always-copy from virtualenv and make it a param (#3136)

In #3082 `--always-copy` was introduced. This caused issue on build on some operating systems. This PR reverts the change but makes `VIRTUALENV_PARAMS` a variable which can be passed to the Makefile. This allows anyone to set `--always-copy` if needed.

* Adjust script to generate fields of type geo_point (#3147)

* Fix for broken dashboard dependency in Cassandra Dashboard (#3146)

The Cassandra Dashboard was linking to the wrong Cassandra visualisation. Some left over with : in the names were still inside

Closes #3140

* Fix quotes (#3142)

* Fix a print statement to be python 3 compliant (#3144)

* Remove -prerelease from the repo names (#3153)

* Add mongobeat to list of community beats (#3156)

Mongobeat discovers instances in a mongo cluster and can be configured to ship multiple document types - from the commands db.stats() and db.serverStatus()

* Update to most recent latest builds (#3161)

* Merge snapshot and latest build for Logstash into 1 docker file

* Pass certificate options to import dashboards script (#3139)

* Pass certificate options to import dashboards script

-cert for client certificate
-key for client certificate key
-cacert for certificate authority

* Add -insecure flag to import_dashboards (#3163)

* Improve speed and stability of CI builds (#3162)

Loading and creating docker images takes quite a bit of time on the travis builds. Especially calls like apt-get update and install take lots of time and bandwidth and fail from time to time, as a host is not available.

Following actions were taken:

* Fake Kibana container is now based on alpine
* Redis stunnel container was also switched to alpine

* Add enabled config for prospectors (#3157)

The enabled config allows easily to enable and disable a specific prospector. This is consistent with metricbeat where each modules has an enabled config. By default enabled is set to true.

* Prototype Filebeat modules implementation (#3158)

Contains the Nginx module, including the fields.yml and several
pipelines.

* Add edits for docker module docs (#3176)

* Restructure and edit processors content (#3160)

* Cleaned up Changelog in master (#3181)

Added the 5.1.0 and 5.1.1 sections, removed duplicates.

* metricbeat: enhance kafka broker matching (#3129)

- compare broker names to hostname
- try to lookup metricbeat host machine fqdn and compare to broker name
- compare all ips of local machine with resolved broker name ips

* Filebeat MySQL module (#3171)

* Contains slowlog and errors filesets
* Test files for two mysql versions (5.5 and 5.7)
* Add support for built-in variables (e.g. `builtin.hostname`)
* Contains a sample Kibana dashboard

Part of #3159.

* Fix #3167 change ownership of files in build/ (#3168)

Add a new Makefile rule: fix-permissions

fix-permissions runs a docker container that changes the ownership
of all files from root to the user that runs the Makefile

* Updating documentation to add udplogbeat (#3190)

* Packer customize package info (#3188)

* packer: Enable overriding of vendor and license
* packer: customize URL of documentation link
* packer: location of readme.md.j2 folder can be specified with PACKER_TEMPLATES_DIR

* Filebeat syslog module (#3191)

* Basic parsing of syslog fields
* Supports multiline messages if the lines after the first one start
  with a space.
* Contains a simple Kibana dashboard

* Deprecate filters option in metrictbeat (#3173)

* Add support for multiple paths per fileset (#3195)

We generally need more than one path per OS, because the logs location
is not always the same. For example, depending on the linux distribution
and how you installed it, MySQL can have its error logs in a number of
default "paths". The solution is to configure them all, which means that
Filebeat might try to access nonexistent folders.

This also improves the python prototype to accept multiple modules and
to accept namespaced parameters. E.g.:

./filebeat.py --modules=nginx,syslog -M nginx.access.paths=...

* case insensitive hostname comparison in kafka broker matching (#3193)

- re-use common.LocalIPAddrs in partition module for resolving IPs
- add missing net.IPAddr type switch to common.LocalIPAddrs
- update matching to extract addresses early on using strings.ToLower
  => ensure case insensitive matching by lowercasing

* Adds a couchbase module for metricbeat (#3081)

* Export cpu cores (#3192)

* Fix: Request headers with split_cookies enabled (#3065)

* Add 3140 to changelog (#3207) (#3208)

(cherry picked from commit 0f4103f)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…tic#3090)

(cherry picked from commit 1078319)