Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add queue settings to docs #4884

Merged
merged 6 commits into from
Oct 9, 2017
Merged

Add queue settings to docs #4884

merged 6 commits into from
Oct 9, 2017

Conversation

urso
Copy link

@urso urso commented Aug 11, 2017

No description provided.

@urso urso added docs needs_backport PR is waiting to be backported to other branches. review labels Aug 11, 2017
Copy link
Contributor

@dedemorton dedemorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested a few edits and some structural changes. Hope you can follow what I'm suggesting.


[float]
[[configuration-internal-queue]]
=== Configure the internal queue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@urso I think my answer in slack must have been confusing. Probably my fault because I was not fully caffeinated.

If this topic is something that users are likely to look for in the TOC, then I would bump it up a level and make it a separate topic. That means putting the content in a separate asciidoc file, removing the float tag, making the heading a level 2 (==), and using asciidoc includes statements to include the file in each of the configuring-howto.asciidoc files. For example: https://github.com/elastic/beats/blob/master/packetbeat/docs/configuring-howto.asciidoc.

On the other hand, if users are unlikely to want to change these settings, it's probably OK to bury them under general settings, but in that case, you could change the heading to "Internal queue configuration options" to make it parallel with the other headings at this level.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Will move it to the TOC

[[configuration-internal-queue]]
=== Configure the internal queue

The `queue` namespace configures the queue type to be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change this to say,

You can configure the type and behavior of the internal queue by setting options in the `queue` section of the +{beatname_lc}.yml+ config file.

Also see my comment about moving this to appear after the description of what the queue does.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

The `queue` namespace configures the queue type to be used.

The Elastic Beats employ an internal queue for events to be published. The
queue is responsible for buffering and combining events into batches, which can
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to: "...into batches that can be consumed by the outputs" [we use "that" for restrictive clauses, "which" for non-restrictive]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I'm a little confused here. Isn't "which" use for restrictive clauses?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the other way around. You use "that" for restrictive clauses. By restrictive, we mean a clause that gives information essential to the meaning of the sentence. In this case, the clause "that can be consumed by the outputs" is restrictive. Sometimes it's a judgment call. TBH, the only people who really care about this distinction are writers and editors. :-) :-)


The `queue` namespace configures the queue type to be used.

The Elastic Beats employ an internal queue for events to be published. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually put this paragraph before the sentence on line 111.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

events: 4096
------------------------------------------------------------------------------

==== Configure the Memory Qeueue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misspelling. This should say "Configure the memory queue". However, I don't think you need a separate section here (at least, not until we add support for more queue types--but even then, I'm not sure your need a separate section). It sounds like the memory queue is the only one that is supported right now? If that's true, we should state that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for a disk based queue with very different settings will be added in the future. Right now we only have the mem queue, but I'd keep it a separate section for now.


The memory queue keeps all events in memory. By default no flush interval is
configured. All events published to this queue will be directly consumed by the
outputs. An output will consume up to the outputs `bulk_max_size` events at once.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is a bit hard to parse: "An output will consume up to the outputs bulk_max_size events at once."

How about this? "An output will consume events in batches based on the output's bulk_max_size setting."

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using up to, as it can be anything between 1 and bulk_max_size. By setting bulk_max_size: -1, it can be anything between 1 and queue.mem.events. I struggled quite a bit trying to express this in a concise manner. But "based on the outputs bulk_max_size" would have me expect it is exactly this number of events (which is not guaranteed by this queue at all).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like when we can be concise, but not if we're obfuscating meaning. In this case, I think the meaning is not clear and the grammatical structure is questionable because you are leaving words out. What you're really saying here is that "An output will consume up to the number of events specified by the output's bulk_max_size setting." You're using shorthand that's common for developers to use, but will probably seem off to some readers. They won't make the same substitution that you make in your head.

Anyhow, for now, at least make "output" possessive so that you have "the output's bulk_max_size.

configured. All events published to this queue will be directly consumed by the
outputs. An output will consume up to the outputs `bulk_max_size` events at once.

Only after an event is dropped or acknowledged by the output, a new slot will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the output is reading in batches, what do we mean by a slot? Does the slot open up before the batch is fully read? Maybe I'm not fully understanding the behavior here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, slot is quite confusing here :(

By slot I mean space is freed for accepting a new event in the queue, if and only if the output ACKs an old published events. The queue.mem.events setting dictates the total number of events buffered by the queue and outputs.

With outputs potentially operating on batches of sizes 1 to max queue size, it is the amount of events ACKed by the output, which can be read/accepted by the queue after the ACK.

Only after an event is dropped or acknowledged by the output, a new slot will
be available in the queue.

By setting `flush.min_events` and `flush.timeout`, spooling in the queue is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be more direct and say, "To enforce spooling in the queue, set the flush.min_events and flush.timeout options.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


The `queue` namespace configures the queue type to be used.

The Elastic Beats employ an internal queue for events to be published. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crap weasel. Looks like some of my comments didn't get saved when I submitted the review. I'd change this to say:

{beatname_uc} uses an internal queue to store events before publishing them. 

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@dedemorton dedemorton mentioned this pull request Aug 12, 2017
42 tasks
published to this queue will be directly consumed by the outputs. An output
will consume up to the outputs `bulk_max_size` events at once.
published to this queue will be directly consumed by the outputs.
The output's `bulk_max_size` setting limits the number of events being processed at once.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this works!

[[configuring-internal-queue]]
== Configure the internal queue

The {beatname_uc} uses an internal queue to store events before publishing them. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: remove "The".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

queue.mem:
events: 4096
------------------------------------------------------------------------------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a [float] tag here to keep the info about the memory queue from becoming a separate topic. If you don't use the [float] tag, you get this:

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Change did require me to add [float] about everywhere for the doc build not to fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, I really dislike [float] tags. We use them all over the library. They seem like such a hack, but are impossible to avoid without re-architecting all the content...and even so, you would still end up having to use them in some places.

Copy link
Contributor

@dedemorton dedemorton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a couple of minor changes.

- remove The
- add [float]
@urso urso mentioned this pull request Aug 20, 2017
22 tasks
@ruflin
Copy link
Contributor

ruflin commented Oct 3, 2017

@dedemorton Could you have a look again at this one to see if we can get it in?

published to this queue will be directly consumed by the outputs.
The output's `bulk_max_size` setting limits the number of events being processed at once.

The memory queue is waiting for the output to acknowledge or drop an events. If
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence has a typo "an events". The verb tense also seems a little off. Should probably say:

The memory queue waits for the output to acknowledge or drop events.

@dedemorton
Copy link
Contributor

One minor comment. Otherwise, LGTM. Sorry about the delay. Didn't realize you were waiting for me to confirm your changes.

@ruflin
Copy link
Contributor

ruflin commented Oct 4, 2017

@dedemorton Thanks for the review again. I fixed the sentence above and will squash merge it as soon as the build goes green.

@ruflin
Copy link
Contributor

ruflin commented Oct 4, 2017

@urso This needs a rebase on master because of some changes in jenkins I made :-( Could you rebase it and squash the commits?

@ruflin ruflin merged commit 54a4481 into elastic:master Oct 9, 2017
ruflin pushed a commit to ruflin/beats that referenced this pull request Oct 9, 2017
* Move queue setting to TOC + address some review
* Remove flush_interval and spooler from outputconfig
* remove 'slot'
* add [float] tags
* Update queueconfig.asciidoc

(cherry picked from commit 54a4481)
@ruflin ruflin removed the needs_backport PR is waiting to be backported to other branches. label Oct 9, 2017
exekias pushed a commit that referenced this pull request Oct 10, 2017
* Move queue setting to TOC + address some review
* Remove flush_interval and spooler from outputconfig
* remove 'slot'
* add [float] tags
* Update queueconfig.asciidoc

(cherry picked from commit 54a4481)
@urso urso deleted the doc/queue-settings branch February 19, 2019 18:55
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
* Move queue setting to TOC + address some review
* Remove flush_interval and spooler from outputconfig
* remove 'slot'
* add [float] tags
* Update queueconfig.asciidoc

(cherry picked from commit d8e7864)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants