[doc] Logstash Kubernetes - Persistent Storage Docs (#14714)
Co-authored-by: Rob Bavey <[email protected]>
Co-authored-by: Karen Metts <[email protected]>

docsk8s/setting-up/ls-k8s-persistent-storage.asciidoc

[[ls-k8s-persistent-storage]]
=== Stateful {ls} for persistent storage

WARNING: This documentation is still in development and may be changed or removed in a future release.

You need {ls} to persist data to disk for certain use cases.
{ls} offers some persistent storage options to help:

* <<persistent-storage-pq,Persistent queue (PQ)>> to absorb bursts of events
* <<persistent-storage-dlq,Dead letter queue (DLQ)>> to accept corrupted events that cannot be processed
* <<persistent-storage-plugins,Persistent storage options in some {ls} plugins>>

There are a few factors to help you decide whether to configure your {ls} core instance to use a persistent queue (PQ), a dead letter queue (DLQ), or both.
In all of these cases, you need to be able to preserve state.
Remember that the {k8s} scheduler can shut down pods at any time and respawn them on another node. To preserve state, define your {ls} deployment using a `StatefulSet` rather than a `Deployment`.

[[persistent-storage-statefulset]]
==== Set up StatefulSet

[source,yaml]
--
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: logstash
  labels:
    app: logstash-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash-demo
  serviceName: logstash
  template:
    metadata:
      labels:
        app: logstash-demo
    spec:
      containers:
      - name: logstash
        image: "docker.elastic.co/logstash/logstash:{version}"
        env:
        - name: LS_JAVA_OPTS
          value: "-Xmx1g -Xms1g"
        resources:
          limits:
            cpu: 2000m
            memory: 2Gi
          requests:
            cpu: 1000m
            memory: 2Gi
        ports:
        - containerPort: 9600
          name: stats
        livenessProbe:
          httpGet:
            path: /
            port: 9600
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /
            port: 9600
          initialDelaySeconds: 60
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        volumeMounts:
        - name: logstash-data <2>
          mountPath: /usr/share/logstash/data
        - name: logstash-pipeline
          mountPath: /usr/share/logstash/pipeline
        - name: logstash-config
          mountPath: /usr/share/logstash/config/logstash.yml
          subPath: logstash.yml
        - name: logstash-config
          mountPath: /usr/share/logstash/config/pipelines.yml
          subPath: pipelines.yml
      volumes:
      - name: logstash-pipeline
        configMap:
          name: logstash-pipeline
      - name: logstash-config
        configMap:
          name: logstash-config
  volumeClaimTemplates: <1>
  - metadata:
      name: logstash-data
      labels:
        app: logstash-demo
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 2Gi
--

The definition is similar to a `Deployment`, except for the use of `volumeClaimTemplates`.

<1> Request 2Gi of persistent storage from `PersistentVolumes`.

<2> Mount the storage to `/usr/share/logstash/data`. This is the default directory that {ls} and its plugins use for any persistence needs.

NOTE: Support for persistent link:https://kubernetes.io/blog/2018/07/12/resizing-persistent-volumes-using-kubernetes/[volume expansion] depends on the storage class. Check with your cloud provider.
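
If your storage class does support expansion, you can grow an existing volume by patching the generated `PersistentVolumeClaim`. A minimal sketch, assuming the storage class sets `allowVolumeExpansion: true`; the claim name follows the `<template>-<statefulset>-<ordinal>` convention, so the `StatefulSet` above yields `logstash-data-logstash-0`:

[source,bash]
--
# volumeClaimTemplates are immutable, so patch the PVC that the
# StatefulSet created rather than the template itself.
kubectl patch pvc logstash-data-logstash-0 \
  -p '{"spec":{"resources":{"requests":{"storage":"4Gi"}}}}'
--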

[[persistent-storage-pq]]
==== Persistent queue (PQ)
You can configure persistent queues globally across all pipelines in `logstash.yml`, and configure individual pipelines in `pipelines.yml`. Settings in `pipelines.yml` override those in `logstash.yml`. Queue data is stored in `/usr/share/logstash/data/queue` by default.

To enable {logstash-ref}/persistent-queues.html[PQ] for every pipeline, specify options in `logstash.yml`.

[source,yaml]
--
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
data:
  logstash.yml: |
    api.http.host: "0.0.0.0"
    queue.type: persisted
    queue.max_bytes: 1024mb
    ...
--
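
To confirm that queue data is landing on the persistent volume, you can list the queue directory inside the pod. A quick check, assuming the pod is named `logstash-0` as in the `StatefulSet` above:

[source,bash]
--
# Each pipeline writes its page and checkpoint files to a
# subdirectory named after the pipeline ID, such as main.
kubectl exec logstash-0 -- ls /usr/share/logstash/data/queue
--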

To set options for individual pipelines, specify them in `pipelines.yml`.

[source,yaml]
--
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
data:
  logstash.yml: |
    api.http.host: "0.0.0.0"
  pipelines.yml: |
    - pipeline.id: fast_ingestion
      path.config: "/usr/share/logstash/pipeline/fast.conf"
      queue.type: persisted
      queue.max_bytes: 1024mb
    - pipeline.id: slow_ingestion
      path.config: "/usr/share/logstash/pipeline/slow.conf"
      queue.type: persisted
      queue.max_bytes: 2048mb
--
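
You can also inspect queue activity per pipeline through the {logstash-ref}/node-stats-api.html[node stats API] on port `9600`, which the `StatefulSet` above already exposes. A sketch, assuming `curl` is available in the image:

[source,bash]
--
# The queue section of each pipeline reports its type,
# events_count, and queue_size_in_bytes.
kubectl exec logstash-0 -- curl -s 'localhost:9600/_node/stats/pipelines?pretty'
--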

[[persistent-storage-dlq]]
==== Dead letter queue (DLQ)

To enable the {logstash-ref}/dead-letter-queues.html[dead letter queue], specify options in `logstash.yml`. The DLQ is stored in `/usr/share/logstash/data/dead_letter_queue` by default.

[source,yaml]
--
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
data:
  logstash.yml: |
    api.http.host: "0.0.0.0"
    dead_letter_queue.enable: true <1>
  pipelines.yml: |
    - pipeline.id: main <2>
      path.config: "/usr/share/logstash/pipeline/main.conf"
    - pipeline.id: dlq <3>
      path.config: "/usr/share/logstash/pipeline/dlq.conf"
--

<1> Enable the DLQ for all pipelines that use the {logstash-ref}/plugins-outputs-elasticsearch.html[elasticsearch output plugin].

<2> The `main` pipeline sends failed events to the DLQ. See the pipeline definition below.

<3> The `dlq` pipeline consumes events from the DLQ, fixes errors, and resends events to {es}. See the pipeline definition below.

[source,yaml]
--
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-pipeline
data:
  main.conf: | <1>
    input {
      exec {
        command => "uptime"
        interval => 5
      }
    }
    output {
      elasticsearch {
        hosts => ["https://hostname.cloud.es.io:9200"]
        index => "uptime-%{+YYYY.MM.dd}"
        user => 'elastic'
        password => 'changeme'
      }
    }
  dlq.conf: | <2>
    input {
      dead_letter_queue {
        path => "/usr/share/logstash/data/dead_letter_queue"
        commit_offsets => true
        pipeline_id => "main"
      }
    }
    filter {
      # Do your fix here
    }
    output {
      elasticsearch {
        hosts => ["https://hostname.cloud.es.io:9200"]
        index => "dlq-%{+YYYY.MM.dd}"
        user => 'elastic'
        password => 'changeme'
      }
    }
--

<1> An example pipeline that tries to send events to a closed index in {es}. To test this behavior manually, use the {ref}/indices-close.html[_close] API to close the index.

<2> This pipeline uses the {logstash-ref}/plugins-inputs-dead_letter_queue.html[dead_letter_queue input plugin] to consume DLQ events. This example sends the events to a different index, but you can add filter plugins to fix other types of errors that cause insertion failures, such as mapping errors.
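
To confirm that failed events are reaching the DLQ, you can list its data directory inside the pod. A quick check, assuming the pod name from the `StatefulSet` above and the `main` pipeline:

[source,bash]
--
# Each pipeline with DLQ enabled writes segment files to a
# subdirectory named after the pipeline ID.
kubectl exec logstash-0 -- ls /usr/share/logstash/data/dead_letter_queue/main
--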

[[persistent-storage-plugins]]
==== Plugins that require local storage to track work done
Many {ls} plugins are stateful and use persistent storage to track the current state of the work that they are doing.

Stateful {ls} plugins typically have some kind of `path` setting, such as `sincedb_path` or `last_run_metadata_path`, that controls where this state is written.

Here is a list of popular plugins that require persistent storage, and therefore a `StatefulSet` with `volumeClaimTemplates` as described in <<persistent-storage-statefulset>>. A configuration sketch follows the table.

[cols="<,<",options="header",]
|=======================================================================
|Plugin |Settings
|logstash-codec-netflow| {logstash-ref}/plugins-codecs-netflow.html#plugins-codecs-netflow-cache_save_path[cache_save_path]
|logstash-input-couchdb_changes| {logstash-ref}/plugins-inputs-couchdb_changes.html#plugins-inputs-couchdb_changes-sequence_path[sequence_path]
|logstash-input-dead_letter_queue| {logstash-ref}/plugins-inputs-dead_letter_queue.html#plugins-inputs-dead_letter_queue-sincedb_path[sincedb_path]
|logstash-input-file| {logstash-ref}/plugins-inputs-file.html#plugins-inputs-file-file_completed_log_path[file_completed_log_path], {logstash-ref}/plugins-inputs-file.html#plugins-inputs-file-sincedb_path[sincedb_path]
|logstash-input-google_cloud_storage| {logstash-ref}/plugins-inputs-google_cloud_storage.html#plugins-inputs-google_cloud_storage-processed_db_path[processed_db_path]
|logstash-input-imap| {logstash-ref}/plugins-inputs-imap.html#plugins-inputs-imap-sincedb_path[sincedb_path]
|logstash-input-jdbc| {logstash-ref}/plugins-inputs-jdbc.html#plugins-inputs-jdbc-last_run_metadata_path[last_run_metadata_path]
|logstash-input-s3| {logstash-ref}/plugins-inputs-s3.html#plugins-inputs-s3-sincedb_path[sincedb_path]
|logstash-filter-aggregate| {logstash-ref}/plugins-filters-aggregate.html#plugins-filters-aggregate-aggregate_maps_path[aggregate_maps_path]
|=======================================================================
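
As an illustration, here is a hypothetical pipeline for the file input that pins `sincedb_path` onto the persistent mount, so that read positions survive pod rescheduling. The log path and file names are assumptions for this sketch:

[source,yaml]
--
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-pipeline
data:
  files.conf: |
    input {
      file {
        # Hypothetical application logs mounted into the pod.
        path => "/var/log/app/*.log"
        # Keep the read-position database on the persistent volume.
        sincedb_path => "/usr/share/logstash/data/plugins/inputs/file/sincedb_app"
      }
    }
    output {
      elasticsearch {
        hosts => ["https://hostname.cloud.es.io:9200"]
        user => 'elastic'
        password => 'changeme'
      }
    }
--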
docsk8s/troubleshooting/ls-k8s-troubleshooting-methods.asciidoc

Here are some approaches that you can use to diagnose the state of your {ls} and {k8s} setup:
* <<ls-k8s-viewing-logs>>
* <<ls-k8s-connecting-to-a-container>>
* <<ls-k8s-diagnostics>>
* <<ls-k8s-pq-util>>
* <<ls-k8s-pq-drain>>

[float]
[[ls-k8s-checking-resources]]

...

Copy the heap dump from the pod to your local machine:
[source,bash]
--
kubectl cp logstash-7477d46bb7-4lcnv:/tmp/heap_dump.hprof ./heap.hprof
--

[[ls-k8s-pq-util]]
=== Running PQ utilities

In the event of persistent queue corruption, the `pqcheck` and `pqrepair` tools are available for troubleshooting.

Run {logstash-ref}/persistent-queues.html#pqcheck[pqcheck] to identify corrupted files:

[source,bash]
--
kubectl exec logstash-0 -it -- /usr/share/logstash/bin/pqcheck /usr/share/logstash/data/queue/pipeline_id
--

Run {logstash-ref}/persistent-queues.html#pqrepair[pqrepair] to repair the queue:

[source,bash]
--
kubectl exec logstash-0 -it -- /usr/share/logstash/bin/pqrepair /usr/share/logstash/data/queue/pipeline_id
--

[[ls-k8s-pq-drain]]
=== Draining the PQ

{ls} provides a `queue.drain: true` setting that, when {ls} is stopped gracefully, pauses shutdown until all messages in the persistent queue have been handled.
Take special care with the `queue.drain: true` setting on {k8s}. By default, a {k8s} pod has a grace period of 30 seconds to shut down before it is terminated forcefully with a `SIGKILL`, which can cause {ls} to exit before the queue is fully drained.

To avoid {ls} shutting down before the queue is completely drained, set the pod's `terminationGracePeriodSeconds` to an artificially long period, such as one year, to give {ls} sufficient time to drain the queue when this functionality is required.
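
A minimal sketch of the pod template change, assuming the `StatefulSet` defined in the setup docs; `queue.drain: true` itself belongs in `logstash.yml`:

[source,yaml]
--
spec:
  template:
    spec:
      # ~1 year: effectively wait until the queue is drained.
      terminationGracePeriodSeconds: 31536000
--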
