Big Data Week Santiago

Santiago Big Data workshop

First, let's talk about the architecture

https://docs.google.com/presentation/d/18yq1IWDAUTiRm8mdd9Vydv1jWrJ7X4k7HPeYCQ38M4Q/edit?usp=sharing

The hands-on exercises



[1] - Create an Elasticsearch instance in Windows

  1. Check the elastic folder.
  2. Extract (untar) everything.
  3. Open a command line (cmd).
  4. Go to the Elasticsearch folder.
  5. Set JAVA_HOME: set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_161"
  6. Start bin\elasticsearch
  7. Do the same for Kibana: go to the Kibana folder.
  8. Start bin\kibana (see the command sketch below).
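
A minimal sketch of the full sequence, assuming the archives were extracted under C:\workshop (the folder names and JDK path are placeholders; adjust them to your machine):

cd C:\workshop\elasticsearch
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_161"
bin\elasticsearch

:: in a second cmd window
cd C:\workshop\kibana
bin\kibana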

[2] - Inspect Kibana

  1. Check what you have in your Elasticsearch cluster (see the requests below).
  2. Find the Elasticsearch link.
  3. Find the Kibana link and log in to Kibana.
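
For step 1, a couple of read-only requests you can run from Kibana Dev Tools (or with curl) to see what the cluster contains:

GET /
GET _cat/indices?v

The first returns the cluster name and version; the second lists the indices with their document counts and sizes.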


[3] - Load the data logs

  1. Go to the Filebeat folder.
  2. Download the following log file: https://github.com/gmoskovicz/bigdatasantiago/blob/master/logs.gz
  3. Inspect the data set.
  4. Modify filebeat.yml to read from the logs and enable the input (set enabled: true, or comment out the enabled: false line).
  5. Modify output.elasticsearch.hosts to point to Elastic Cloud, and set up the username and password (a filebeat.yml sketch follows this list).
  6. Run filebeat.
  7. Open Kibana and configure an index pattern.
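
A minimal filebeat.yml sketch for steps 4 and 5; the log path, Cloud endpoint, and credentials are placeholders to replace with your own (and note that Filebeat 6.2 and earlier name the input section filebeat.prospectors rather than filebeat.inputs):

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\workshop\filebeat\logs    # placeholder: path to the extracted log file

output.elasticsearch:
  hosts: ["https://<your-cluster-endpoint>:9243"]    # placeholder Elastic Cloud endpoint
  username: "elastic"
  password: "<your-password>"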


[4] Creating the pipeline to parse the logs

  1. Go to Kibana and open Dev Tools.
  2. Start with the following request and build out the pipeline; for the Apache log grok patterns, check out https://github.com/elastic/elasticsearch/blob/master/libs/grok/src/main/resources/patterns/grok-patterns#L94
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Parsing the logs",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            ""
          ]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": """
        83.149.9.216 - - [26/Aug/2014:21:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 430406 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
        """
      }
    }
  ]
}

You should end up with something like the following:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Parsing the logs",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            "%{COMMONAPACHELOG}"
          ]
        }
      },
      {
        "remove": {
          "field": "message"
        }
      },
      {
        "convert": {
          "field": "response",
          "type": "integer"
        }
      },
      {
        "convert": {
          "field": "bytes",
          "type": "long"
        }
      },
      {
        "date": {
          "field": "timestamp",
          "target_field": "@timestamp",
          "formats": ["dd/MMM/yyyy:HH:mm:ss Z"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": """
        83.149.9.216 - - [26/Aug/2014:21:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 430406 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
        """
      }
    }
  ]
}
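
If the pipeline is right, the simulated document in the response carries the parsed fields instead of the raw message. Abridged (ingest metadata and a few grok fields omitted), the result for the sample line should look roughly like this:

{
  "docs": [
    {
      "doc": {
        "_source": {
          "clientip": "83.149.9.216",
          "verb": "GET",
          "request": "/presentations/logstash-monitorama-2013/images/sad-medic.png",
          "httpversion": "1.1",
          "response": 200,
          "bytes": 430406,
          "@timestamp": "2014-08-26T21:13:42.000Z"
        }
      }
    }
  ]
}

Note that response is an integer and bytes a long thanks to the convert processors, and @timestamp was parsed from the Apache timestamp by the date processor.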


[5] Add the pipeline and configure it

  1. Create the pipeline in Dev Tools, then reference it in filebeat.yml:
PUT _ingest/pipeline/parse_logs
{
  "description": "Parsing the logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMMONAPACHELOG}"
        ]
      }
    },
    {
      "remove": {
        "field": "message"
      }
    },
    {
      "convert": {
        "field": "response",
        "type": "integer"
      }
    },
    {
      "convert": {
        "field": "bytes",
        "type": "long"
      }
    },
    {
      "date" : {
        "field" : "timestamp",
        "target_field" : "@timestamp",
        "formats" : ["dd/MMM/yyyy:HH:mm:ss Z"]
      }
    }
  ]
}

Then add pipeline: parse_logs under output.elasticsearch in filebeat.yml, as sketched below.
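
The resulting output section would look something like this (endpoint and credentials are the same placeholders as in step [3]):

output.elasticsearch:
  hosts: ["https://<your-cluster-endpoint>:9243"]
  username: "elastic"
  password: "<your-password>"
  pipeline: parse_logs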



[6] Delete data and start from scratch

  1. Run DELETE filebeat-* in Dev Tools (see the sketch below).
  2. Delete the registry file under data in the filebeat folder.
  3. Start filebeat again.
  4. Go to Management > Index Patterns in Kibana and refresh the index pattern's field list.
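
A sketch of the reset; step 1 runs in Dev Tools:

DELETE filebeat-*

and steps 2-3 in cmd, assuming Filebeat lives under C:\workshop\filebeat (in Filebeat 6.x the registry is the data\registry file):

cd C:\workshop\filebeat
del data\registry
filebeat.exe -e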


[7] Create a dashboard to visualize the logs

Create four visualizations and combine them into a dashboard.
