Santiago Big Data workshop
Slides: https://docs.google.com/presentation/d/18yq1IWDAUTiRm8mdd9Vydv1jWrJ7X4k7HPeYCQ38M4Q/edit?usp=sharing
- Check the elastic folder
- Untar everything
- Open a command line (cmd)
- Go to the Elasticsearch folder
- Set
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_161"
- Start
bin\elasticsearch
- Do the same for Kibana: go to the Kibana folder
- Start
bin\kibana
- Check what you have in your Elasticsearch cluster
- Find the Elasticsearch link.
- Find the Kibana link and log in to Kibana.
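A quick way to check the cluster (assuming the default local ports, Elasticsearch on 9200 and Kibana on 5601) is to list its indices, either from a browser or curl against http://localhost:9200/_cat/indices?v, or from Kibana Dev Tools:
GET _cat/indices?v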
- Go to the Filebeat folder
- Download the following log: https://github.com/gmoskovicz/bigdatasantiago/blob/master/logs.gz
- Inspect the data set.
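One way to inspect it, assuming a shell with gzip and head available (e.g. Git Bash on Windows), is to decompress a copy and look at the first lines; each line is an Apache-style access log entry like the sample used in the pipeline simulation below:
gzip -dk logs.gz
head -n 5 logs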
- Modify filebeat.yml to read from the logs and enable the input (comment out enabled: false)
- Modify output.elasticsearch.hosts to send to Elastic Cloud, and set up the username and password (see the sketch below)
- Run Filebeat
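A minimal sketch of the relevant filebeat.yml sections (the path, host and credentials are placeholders; older 6.x releases name the input section filebeat.prospectors instead of filebeat.inputs):
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\path\to\logs          # placeholder: the extracted log file downloaded above
output.elasticsearch:
  hosts: ["https://YOUR_CLOUD_ENDPOINT:9243"]   # placeholder Elastic Cloud endpoint
  username: "elastic"
  password: "YOUR_PASSWORD"                     # placeholder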
- Open Kibana and configure the index pattern
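With the default Filebeat settings, the matching index pattern is filebeat-* and @timestamp is the time field.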
- Go to Kibana and open Dev Tools
- Start with the following and create the pipeline; check out https://github.com/elastic/elasticsearch/blob/master/libs/grok/src/main/resources/patterns/grok-patterns#L94
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "Parsing the logs",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
""
]
}
}
]
},
"docs": [
{
"_source": {
"message": """
83.149.9.216 - - [26/Aug/2014:21:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 430406 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
"""
}
}
]
}
You should end up with something like the following:
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "Parsing the logs",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{COMMONAPACHELOG}"
]
}
},
{
"remove": {
"field": "message"
}
},
{
"convert": {
"field": "response",
"type": "integer"
}
},
{
"convert": {
"field": "bytes",
"type": "long"
}
},
{
"date" : {
"field" : "timestamp",
"target_field" : "@timestamp",
"formats" : ["dd/MMM/yyyy:HH:mm:ss Z"]
}
}
]
},
"docs": [
{
"_source": {
"message": """
83.149.9.216 - - [26/Aug/2014:21:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/sad-medic.png HTTP/1.1" 200 430406 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
"""
}
}
]
}
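For reference, the COMMONAPACHELOG pattern extracts fields such as clientip, auth, timestamp, verb, request, httpversion, response and bytes, which is why the later processors can convert response and bytes to numbers and parse timestamp into @timestamp.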
- Add the pipeline and configure it in filebeat.yml:
PUT _ingest/pipeline/parse_logs
{
"description": "Parsing the logs",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{COMMONAPACHELOG}"
]
}
},
{
"remove": {
"field": "message"
}
},
{
"convert": {
"field": "response",
"type": "integer"
}
},
{
"convert": {
"field": "bytes",
"type": "long"
}
},
{
"date" : {
"field" : "timestamp",
"target_field" : "@timestamp",
"formats" : ["dd/MMM/yyyy:HH:mm:ss Z"]
}
}
]
}
Then add pipeline: parse_logs under output.elasticsearch in filebeat.yml.
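For example, keeping the placeholder host and credentials from the earlier step:
output.elasticsearch:
  hosts: ["https://YOUR_CLOUD_ENDPOINT:9243"]
  username: "elastic"
  password: "YOUR_PASSWORD"
  pipeline: parse_logs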
- Delete the existing data by running the following in Dev Tools
DELETE filebeat-*
- Delete the registry file under data in the Filebeat folder
- Start Filebeat again
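From the Filebeat folder, assuming the default data directory and that Filebeat is run directly from the command line (not as a service), that could look like:
del data\registry
filebeat.exe -e -c filebeat.yml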
- Go to Management > Index Patterns in Kibana and refresh the index pattern's field list
- Create 4 visualizations and a dashboard