Example of how to import nginx logs into elasticsearch

Nginx logs are collected by filebeat and passed to logstash, which processes them and writes them to elasticsearch. Filebeat is responsible only for collection; logstash handles log parsing and formatting, field replacement and splitting, and the creation of date-based indexes when writing to elasticsearch.
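
In short, the data flows like this:

nginx access log -> filebeat (collect) -> logstash (parse, enrich) -> elasticsearch (index)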

1. Configure nginx log format

log_format main '$remote_addr $http_x_forwarded_for [$time_local] $server_name $request '
                '$status $body_bytes_sent $http_referer '
                '"$http_user_agent" '
                '"$connection" '
                '"$http_cookie" '
                '$request_time '
                '$upstream_response_time';
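
With this format, an access log line looks roughly like the following (hypothetical values, shown only to illustrate the field layout that the grok pattern below expects):

192.168.15.1 1.2.3.4, 5.6.7.8 [20/May/2019:21:05:56 +0800] blog.cnfol.com GET /index.html HTTP/1.1 200 1024 - "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" "100" "uid=abc123; token=xyz" 0.005 0.003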

2. Install and configure filebeat and enable the nginx module

tar -zxvf filebeat-6.2.4-linux-x86_64.tar.gz -C /usr/local
cd /usr/local;ln -s filebeat-6.2.4-linux-x86_64 filebeat
cd /usr/local/filebeat

Enable the nginx module

./filebeat modules enable nginx

List the modules

./filebeat modules list

Create a configuration file

vim /usr/local/filebeat/blog_module_logstash.yml
filebeat.modules:
- module: nginx
  access:
    enabled: true
    var.paths: ["/home/weblog/blog.cnfol.com_access.log"]
  #error:
  #  enabled: true
  #  var.paths: ["/home/weblogerr/blog.cnfol.com_error.log"]

output.logstash:
  hosts: ["192.168.15.91:5044"]

Start filebeat

./filebeat -c blog_module_logstash.yml -e
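
The -e flag logs to stderr for debugging. To keep filebeat running after the terminal closes, it can also be started in the background, for example:

nohup ./filebeat -c blog_module_logstash.yml -e >/dev/null 2>&1 &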

3. Configure logstash

tar -zxvf logstash-6.2.4.tar.gz -C /usr/local
cd /usr/local;ln -s logstash-6.2.4 logstash

Create a pipeline file for the nginx log

cd /usr/local/logstash

Logstash's built-in grok pattern directory

vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns

Edit the grok-patterns file and add a pattern that matches one or more IP addresses

FORWORD (?:%{IPV4}[,]?[ ]?)+|%{WORD}
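
For reference, this pattern matches values such as the following (illustrative examples):

192.168.15.1                   a single client IP
1.2.3.4, 5.6.7.8, 9.10.11.12   a CDN chain of client and proxy IPs
unknown                        a bare word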

Official grok pattern reference

http://grokdebug.herokuapp.com/patterns#

Create a logstash pipeline configuration file

#input {
#    stdin {}
#}
# Accept data input from filebeat
input {
    beats {
        port => 5044
        host => "0.0.0.0"
    }
}

filter {
    # Add a debugging switch
    mutate { add_field => { "[@metadata][debug]" => true } }
    grok {
        # Parse the nginx log
        #match => { "message" => "%{NGINXACCESS_TEST2}" }
        #match => { "message" => '%{IPORHOST:clientip} # (?<http_x_forwarded_for>[^\#]*) # \[%{HTTPDATE:[@metadata][webtime]}\] # %{NOTSPACE:hostname} # %{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion} # %{NUMBER:response} # (?:%{NUMBER:bytes}|-) # (?:"(?:%{NOTSPACE:referrer}|-)"|%{NOTSPACE:referrer}|-) # (?:"(?<http_user_agent>[^#]*)") # (?:"(?:%{NUMBER:connection}|-)"|%{NUMBER:connection}|-) # (?:"(?<cookies>[^#]*)") # %{NUMBER:request_time:float} # (?:%{NUMBER:upstream_response_time:float}|-)' }
        #match => { "message" => '(?:%{IPORHOST:clientip}|-) (?:%{TWO_IP:http_x_forwarded_for}|%{IPV4:http_x_forwarded_for}|-) \[%{HTTPDATE:[@metadata][webtime]}\] (?:%{HOSTNAME:hostname}|-) %{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion} %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{NOTSPACE:referrer}|-)"|%{NOTSPACE:referrer}|-) %{QS:agent} (?:"(?:%{NUMBER:connection}|-)"|%{NUMBER:connection}|-) (?:"(?<cookies>[^#]*)") %{NUMBER:request_time:float} (?:%{NUMBER:upstream_response_time:float}|-)' }
        match => { "message" => '(?:%{IPORHOST:clientip}|-) %{FORWORD:http_x_forwarded_for} \[%{HTTPDATE:[@metadata][webtime]}\] (?:%{HOSTNAME:hostname}|-) %{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion} %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{NOTSPACE:referrer}|-)"|%{NOTSPACE:referrer}|-) %{QS:agent} (?:"(?:%{NUMBER:connection}|-)"|%{NUMBER:connection}|-) %{QS:cookie} %{NUMBER:request_time:float} (?:%{NUMBER:upstream_response_time:float}|-)' }
    }
    # Copy the default @timestamp (the time when beats collected the log) to a new field @read_timestamp
    ruby {
        #code => "event.set('@read_timestamp',event.get('@timestamp'))"
        # Adjust the time zone to UTC+8
        code => "event.set('@read_timestamp',event.get('@timestamp').time.localtime + 8*60*60)"
    }
    # Parse the nginx record time, e.g. 20/May/2015:21:05:56 +0000
    date {
        locale => "en"
        match => ["[@metadata][webtime]","dd/MMM/yyyy:HH:mm:ss Z"]
    }
    # Convert the bytes field from a string to an integer
    mutate {
        convert => {"bytes" => "integer"}
    }
    # Parse the cookie field into json
    #mutate {
    #    gsub => ["cookies",'\;',',']
    #}
    # If CDN acceleration is used, http_x_forwarded_for contains multiple IPs; the first one is the user's real IP (e.g. "1.2.3.4, 5.6.7.8" becomes "1.2.3.4")
    if [http_x_forwarded_for] =~ ", " {
        ruby {
            code => 'event.set("http_x_forwarded_for", event.get("http_x_forwarded_for").split(",")[0])'
        }
    }
    # Resolve the IP address to a geographical location
    geoip {
        source => "http_x_forwarded_for"
        # Keep only the latitude/longitude, country, city, and region fields
        #fields => ["location","country_name","city_name","region_name"]
    }
    # Parse the agent field to get browser and OS details
    useragent {
        source => "agent"
        target => "useragent"
    }
    # Fields to delete
    #mutate { remove_field => ["message"] }
    # Set the index name prefix from the log file name
    ruby {
        code => 'event.set("[@metadata][index_pre]",event.get("source").split("/")[-1])'
    }
    # Format @timestamp as e.g. 2019.04.23
    ruby {
        code => 'event.set("[@metadata][index_day]",event.get("@timestamp").time.localtime.strftime("%Y.%m.%d"))'
    }
    # Set the default output index name
    mutate {
        add_field => {
            #"[@metadata][index]" => "%{[@metadata][index_pre]}_%{+YYYY.MM.dd}"
            "[@metadata][index]" => "%{[@metadata][index_pre]}_%{[@metadata][index_day]}"
        }
    }
    # Parse the cookies field into json
    #mutate {
    #    gsub => [
    #        "cookies", ";", ",",
    #        "cookies", "=", ":"
    #    ]
    #    #split => {"cookies" => ","}
    #}
    #json_encode {
    #    source => "cookies"
    #    target => "cookies_json"
    #}
    #mutate {
    #    gsub => [
    #        "cookies_json", ',', '","',
    #        "cookies_json", ':', '":"'
    #    ]
    #}
    #json {
    #    source => "cookies_json"
    #    target => "cookies2"
    #}
    # If grok parsing failed, write the event to a separate failure index
    if "_grokparsefailure" in [tags] {
    #if "_dateparsefailure" in [tags] {
        mutate {
            replace => {
                #"[@metadata][index]" => "%{[@metadata][index_pre]}_failure_%{+YYYY.MM.dd}"
                "[@metadata][index]" => "%{[@metadata][index_pre]}_failure_%{[@metadata][index_day]}"
            }
        }
    # If there was no parsing error, delete the message field
    } else {
        mutate { remove_field => ["message"] }
    }
}

output {
    if [@metadata][debug] {
        # Output to stdout with the rubydebug codec, including metadata
        stdout { codec => rubydebug { metadata => true } }
    } else {
        # Print a dot for each event processed
        stdout { codec => dots }
        # Output to the specified elasticsearch
        elasticsearch {
            hosts => ["192.168.15.160:9200"]
            index => "%{[@metadata][index]}"
            document_type => "doc"
        }
    }
}
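
Before starting, the pipeline syntax can be validated (assuming the file above was saved as test_pipline2.conf):

bin/logstash -f test_pipline2.conf --config.test_and_exit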

Start logstash

nohup bin/logstash -f test_pipline2.conf &
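
Once events are flowing, you can verify on the elasticsearch side that the daily index was created; the exact index name depends on your log file name and the current date, e.g.:

curl -s 'http://192.168.15.160:9200/_cat/indices?v'
curl -s 'http://192.168.15.160:9200/blog.cnfol.com_access.log_2019.04.23/_search?size=1&pretty'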

The above is the full content of this article. I hope it is helpful for your study, and I hope everyone will support 123WORDPRESS.COM.

