Application Architecture for Collecting Nginx Access Logs with ELK

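Nginx log format and log variables
Like Apache, Nginx supports custom log output formats. Before defining an Nginx log format, it helps to understand a few concepts about obtaining the user's real IP behind multiple layers of proxies:
- remote_addr: the client address, with one caveat. Without a proxy, this is the client's real IP; with a proxy, it is the IP of the upstream proxy, that is, the last hop that connected to the server. It is equivalent to the Apache log variable %a.
- X-Forwarded-For: abbreviated XFF, an HTTP extension header in the format X-Forwarded-For: client, proxy1, proxy2. Suppose an HTTP request passes through three proxies Proxy1, Proxy2 and Proxy3 (IPs IP1, IP2 and IP3) before reaching the server, and the user's real IP is IP0. Following the XFF convention, the server ultimately receives: X-Forwarded-For: IP0, IP1, IP2
So X-Forwarded-For does not contain IP3, and IP3 is exactly the address that remote_addr holds.
A few more easily confused variables are listed here for reference:
- $remote_addr: if the request comes through a proxy, this variable holds the IP of the upstream proxy; without a proxy, it holds the client's real IP. It is equivalent to %a in Apache logs.
- $http_x_forwarded_for: the value of the X-Forwarded-For request header.
- $proxy_add_x_forwarded_for: the value of $http_x_forwarded_for with $remote_addr appended, separated by a comma. If the request carries no X-Forwarded-For header, it equals $remote_addr.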
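The way these values build up hop by hop can be sketched in a few lines of Python. This is only an illustration of the convention described above, not Nginx code; the function names `forward` and `simulate_chain` are made up for this example, and IP0 to IP3 are the placeholder addresses from the text.

```python
def forward(xff_header, remote_addr):
    """Mimic $proxy_add_x_forwarded_for: the incoming XFF value plus the peer address."""
    if not xff_header:
        return remote_addr
    return f"{xff_header}, {remote_addr}"


def simulate_chain(client_ip, proxy_ips):
    """Return (X-Forwarded-For, remote_addr) as seen by the final server."""
    xff = ""
    peer = client_ip
    for proxy_ip in proxy_ips:
        xff = forward(xff, peer)  # each proxy appends the address that connected to it
        peer = proxy_ip           # the next hop sees this proxy as its peer
    return xff, peer


xff, remote_addr = simulate_chain("IP0", ["IP1", "IP2", "IP3"])
print(xff)                        # IP0, IP1, IP2
print(remote_addr)                # IP3
print(forward(xff, remote_addr))  # IP0, IP1, IP2, IP3
```

The last line corresponds to what $proxy_add_x_forwarded_for would contain on the final server: the full chain including the last proxy.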
The default log format, named main, is defined and referenced as follows:
[root@filebeatserver nginx]# grep -A 4 'log_format' /etc/nginx/nginx.conf
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
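Customizing the Nginx log format
With the meaning of the Nginx log variables understood, the next step is to rework the log output format. Here the Nginx log output is again set to JSON. Only the log format and log file definitions from the Nginx configuration file nginx.conf are listed below: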
map $http_x_forwarded_for $clientRealIp { # define the log variable clientRealIp
  "" $remote_addr; # when $http_x_forwarded_for is empty, assign the value of $remote_addr to $clientRealIp
  ~^(?P<firstAddr>[0-9\.]+),?.*$ $firstAddr; # when $http_x_forwarded_for is not empty, a regular expression captures its first IP as $firstAddr, which is then assigned to $clientRealIp
} # taken as a whole, the map block obtains the real client IP and assigns it to $clientRealIp, which the log format below references
# The custom nginx log format follows
[root@filebeatserver ~]# vim /etc/nginx/nginx.conf
log_format nginx_log_json '{"accessip_list":"$proxy_add_x_forwarded_for","client_ip":"$clientRealIp","http_host":"$host","@timestamp":"$time_iso8601","method":"$request_method","url":"$request_uri","status":"$status","http_referer":"$http_referer","body_bytes_sent":"$body_bytes_sent","request_time":"$request_time","http_user_agent":"$http_user_agent","total_bytes_sent":"$bytes_sent","server_ip":"$server_addr"}';
access_log /var/log/nginx/access.log nginx_log_json;
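To see what the map block above resolves to for different header values, its two rules can be mirrored in Python. This is a rough sketch rather than a replacement for the Nginx configuration: the function name `client_real_ip` is hypothetical, and the empty-string fallback reflects the fact that a map without a `default` entry yields an empty value when nothing matches.

```python
import re

# Same pattern as the map block: capture the leading run of digits and dots.
FIRST_ADDR = re.compile(r"^(?P<firstAddr>[0-9.]+),?.*$")


def client_real_ip(http_x_forwarded_for, remote_addr):
    """Mirror the map rules: empty XFF -> remote_addr, otherwise the first XFF entry."""
    if http_x_forwarded_for == "":
        return remote_addr
    match = FIRST_ADDR.match(http_x_forwarded_for)
    if match:
        return match.group("firstAddr")
    return ""  # no rule matched and the map has no default entry


print(client_real_ip("", "192.168.126.1"))
print(client_real_ip("192.168.126.1, 192.168.126.92", "192.168.126.91"))
print(client_real_ip("unknown", "192.168.126.91"))
```

One consequence worth knowing: because the pattern only accepts digits and dots, a value such as `unknown` produces an empty client_ip, and an IPv6 address such as `2001:db8::1` would be cut off at the first colon.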
Verifying the log output
[root@filebeatserver ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[root@filebeatserver ~]# systemctl restart nginx
[root@filebeatserver ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.90
Visit http://192.168.126.90 in a browser.

Check the nginx log:
[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T22:54:46+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T22:54:46+08:00","method":"GET","url":"/favicon.ico","status":"404","http_referer":"-","body_bytes_sent":"153","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"308","server_ip":"192.168.126.90"}
Setting up one layer of reverse proxy in front of the nginx server
[root@kafkazk1 ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.91
[root@kafkazk1 ~]# vim /etc/httpd/conf/httpd.conf
ProxyPass / http://192.168.126.90/
ProxyPassReverse / http://192.168.126.90/
[root@kafkazk1 ~]# systemctl restart httpd
Visit http://192.168.126.91 in a browser.
Check the nginx log:
[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1, 192.168.126.91","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T23:02:48+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
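# The "accessip_list" field now holds two IPs: the first is the real client IP and the second is the proxy IP; the "client_ip" field is the real client IP
Adding a second layer of reverse proxy on top of the first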
[root@kafkazk2 ~]# ifconfig ens32 | awk 'NR==2 {print $2}'
192.168.126.92
[root@kafkazk2 ~]# vim /etc/httpd/conf/httpd.conf
ProxyPass / http://192.168.126.91/
ProxyPassReverse / http://192.168.126.91/
[root@kafkazk2 ~]# systemctl restart httpd
Visit http://192.168.126.92 in a browser.

Check the nginx log:
[root@filebeatserver ~]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1, 192.168.126.92, 192.168.126.91","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-14T23:08:11+08:00","method":"GET","url":"/","status":"200","http_referer":"-","body_bytes_sent":"612","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"850","server_ip":"192.168.126.90"}
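# The "accessip_list" field now holds three IPs: the first is the real client IP, the second is 192.168.126.92 (the second-layer proxy, which the client contacted), and the third is 192.168.126.91 (the first-layer proxy, which connected directly to Nginx); the "client_ip" field is the real client IP
These outputs show how client_ip and accessip_list differ: client_ip always holds the real client IP, while accessip_list is the list of IPs accumulated across the proxies. The first log entry comes from accessing http://192.168.126.90 directly without any proxy, the second from accessing http://192.168.126.91 through one proxy layer, and the third from accessing http://192.168.126.92 through two proxy layers.
Obtaining the real client IP in Nginx is straightforward and needs no special handling, which also saves a lot of work when writing the logstash event configuration file later.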
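The relationship between the two fields can also be checked programmatically. The sketch below parses three log lines reduced to the two relevant fields (the values are taken from the log output shown earlier) and separates the client from the proxy hops; the helper name `proxy_hops` is an assumption made for this example.

```python
import json

# accessip_list / client_ip pairs copied from the three log entries above.
LOG_LINES = [
    '{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1"}',
    '{"accessip_list":"192.168.126.1, 192.168.126.91","client_ip":"192.168.126.1"}',
    '{"accessip_list":"192.168.126.1, 192.168.126.92, 192.168.126.91","client_ip":"192.168.126.1"}',
]


def proxy_hops(line):
    """Return (client_ip, [proxy IPs in the order the request traversed them])."""
    entry = json.loads(line)
    chain = [ip.strip() for ip in entry["accessip_list"].split(",")]
    return entry["client_ip"], chain[1:]


for line in LOG_LINES:
    client, hops = proxy_hops(line)
    print(client, len(hops), hops)
```

In every case the first element of accessip_list equals client_ip, and the number of remaining elements equals the number of proxy layers.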
Configuring filebeat
filebeat is installed on the Nginx server. The configured filebeat.yml is as follows:
[root@filebeatserver filebeat]# vim /usr/local/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  fields:
    log_topic: nginxlogs
name: "192.168.126.90"
output.kafka:
  enabled: true
  hosts: ["192.168.126.91:9092", "192.168.126.92:9092", "192.168.126.93:9092"]
  version: "0.10"
  topic: '%{[fields][log_topic]}'
  partition.round_robin:
    reachable_only: true
  worker: 2
  required_acks: 1
  compression: gzip
  max_message_bytes: 10000000
logging.level: debug
# Start filebeat
[root@filebeatserver filebeat]# nohup /usr/local/filebeat/filebeat -e -c /usr/local/filebeat/filebeat.yml &
[1] 1056
nohup: ignoring input and appending output to ‘nohup.out’
Starting the kafka + zookeeper cluster
/usr/local/zookeeper/bin/zkServer.sh start
nohup /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties &
Visit nginx in a browser.

Check the nginx access log:
[root@filebeatserver filebeat]# tailf /var/log/nginx/access.log
{"accessip_list":"192.168.126.1","client_ip":"192.168.126.1","http_host":"192.168.126.90","@timestamp":"2021-08-15T15:10:23+08:00","method":"GET","url":"/","status":"304","http_referer":"-","body_bytes_sent":"0","request_time":"0.000","http_user_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0","total_bytes_sent":"180","server_ip":"192.168.126.90"}
At the same time, verify in the filebeat debug output that the log entry has been collected:
2021-08-15T15:10:32.294+0800 DEBUG [publish] pipeline/processor.go:308 Publish event: {
"@timestamp": "2021-08-15T07:10:32.292Z",
"@metadata": {
"beat": "filebeat",
"type": "doc",
"version": "6.5.4"
},
"offset": 2007,
"message": "{\"accessip_list\":\"192.168.126.1\",\"client_ip\":\"192.168.126.1\",\"http_host\":\"192.168.126.90\",\"@timestamp\":\"2021-08-15T15:10:23+08:00\",\"method\":\"GET\",\"url\":\"/\",\"status\":\"304\",\"http_referer\":\"-\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_user_agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0\",\"total_bytes_sent\":\"180\",\"server_ip\":\"192.168.126.90\"}",
"fields": {
"log_topic": "nginxlogs"
},
"prospector": {
"type": "log"
},
"input": {
"type": "log"
},
"beat": {
"name": "192.168.126.90",
"hostname": "filebeatserver",
"version": "6.5.4"
},
"host": {
"name": "192.168.126.90"
},
"source": "/var/log/nginx/access.log"
}
Verify that the message can be consumed from the kafka cluster:
[root@kafkazk1 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper 192.168.126.91:2181,192.168.126.92:2181,192.168.126.93:2181 --topic nginxlogs
{"@timestamp":"2021-08-15T07:10:32.292Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.5.4","topic":"nginxlogs"},"prospector":{"type":"log"},"input":{"type":"log"},"beat":{"name":"192.168.126.90","hostname":"filebeatserver","version":"6.5.4"},"host":{"name":"192.168.126.90"},"source":"/var/log/nginx/access.log","offset":2007,"message":"{\"accessip_list\":\"192.168.126.1\",\"client_ip\":\"192.168.126.1\",\"http_host\":\"192.168.126.90\",\"@timestamp\":\"2021-08-15T15:10:23+08:00\",\"method\":\"GET\",\"url\":\"/\",\"status\":\"304\",\"http_referer\":\"-\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_user_agent\":\"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0\",\"total_bytes_sent\":\"180\",\"server_ip\":\"192.168.126.90\"}","fields":{"log_topic":"nginxlogs"}}
Both checks show the log entry being collected correctly.
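Note that the record in kafka is JSON inside JSON: filebeat wraps the original Nginx line as a string in its own `message` field. A consumer therefore has to decode twice. The following sketch shows the idea on a shortened record; the helper name `unwrap` is hypothetical and only a handful of the fields from the output above are kept.

```python
import json

# A shortened Nginx log line and the filebeat envelope around it.
nginx_line = '{"client_ip":"192.168.126.1","method":"GET","url":"/","status":"304"}'
kafka_value = json.dumps({
    "@timestamp": "2021-08-15T07:10:32.292Z",
    "source": "/var/log/nginx/access.log",
    "message": nginx_line,  # the Nginx JSON travels as an escaped string
    "fields": {"log_topic": "nginxlogs"},
})


def unwrap(value):
    """Decode the filebeat envelope, then decode the Nginx JSON held in 'message'."""
    envelope = json.loads(value)
    inner = json.loads(envelope["message"])
    return envelope["fields"]["log_topic"], inner


topic, access = unwrap(kafka_value)
print(topic, access["client_ip"], access["status"])
```

The outer `@timestamp` is the time filebeat read the line, while the `@timestamp` inside the Nginx line is the time of the request, which is why the two differ in the output above.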
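Configuring logstash
Because the log format is already defined in the Nginx output, logstash needs very little filtering or parsing of the log. The logstash event configuration file kafka_nginx_into_es.conf is given directly below: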
[root@logstashserver ~]# vim /usr/local/logstash/config/kafka_nginx_into_es.conf
input {
  kafka {
    bootstrap_servers => "192.168.126.91:9092,192.168.126.92:9092,192.168.126.93:9092"
    topics => "nginxlogs" # topic to read from; with Kafka's default auto.create.topics.enable=true, the nginxlogs topic is created automatically if it does not exist yet
    group_id => "logstash"
    codec => json {
      charset => "UTF-8"
    }
    add_field => { "[@metadata][myid]" => "nginxaccess-log" } # add a marker field used for identification; the filter and output sections test it
  }
}
filter {
  if [@metadata][myid] == "nginxaccess-log" {
    mutate {
      gsub => ["message","\\x","\\\x"] # message is the field holding the raw log line. Nginx writes non-ASCII bytes as \xHH, which is not a valid JSON escape, so every \x is rewritten to \\x. This handles URLs that contain Chinese characters
    }
    if ('method":"HEAD' in [message]) { # drop the event if the message field contains a HEAD request
      drop {}
    }
    json {
      source => "message"
      remove_field => ["prospector", "beat", "source", "input", "offset", "fields", "host", "@version", "message"]
    }
  }
}
output {
  if [@metadata][myid] == "nginxaccess-log" {
    elasticsearch {
      hosts => ["192.168.126.95:9200","192.168.126.96:9200","192.168.126.97:9200"]
      index => "logstash_nginxlogs-%{+YYYY.MM.dd}" # name of the elasticsearch index for the Nginx logs, used later in Kibana; a name starting with logstash, followed by an identifier and the date, is recommended
    }
  }
}
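This logstash event configuration file is very simple and applies no special handling to the log format or logic; it is essentially the same as the one used for collecting Apache logs with ELK. Once everything is configured, start logstash with the following command: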
[root@logstashserver ~]# nohup /usr/local/logstash/bin/logstash -f /usr/local/logstash/config/kafka_nginx_into_es.conf &
[1] 1084
nohup: ignoring input and appending output to ‘nohup.out’
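For readers who want to reason about the filter section without running logstash, its three steps (rewrite `\x`, drop HEAD requests, parse `message` and remove the envelope fields) can be approximated in Python. This is a simplified model under stated assumptions, not logstash's implementation: `filter_event` is a made-up name, and the sample events are constructed for illustration only.

```python
import json

REMOVED_FIELDS = ("prospector", "beat", "source", "input", "offset",
                  "fields", "host", "@version", "message")


def filter_event(event):
    """Approximate the filter block: returns the cleaned event, or None if dropped."""
    # mutate/gsub: turn the invalid JSON escape \x into a literal backslash + x
    message = event["message"].replace("\\x", "\\\\x")
    # conditional drop: discard HEAD requests
    if 'method":"HEAD' in message:
        return None
    # json filter: merge the parsed fields into the event, then remove the extras
    result = dict(event)
    result.update(json.loads(message))
    for name in REMOVED_FIELDS:
        result.pop(name, None)
    return result


# Hypothetical events: a GET whose URL contains \x-escaped bytes, and a HEAD request.
get_event = {"message": r'{"method":"GET","url":"/\xE4\xB8\xAD","status":"200"}',
             "offset": 2007, "source": "/var/log/nginx/access.log"}
head_event = {"message": r'{"method":"HEAD","url":"/","status":"200"}'}

print(filter_event(get_event))
print(filter_event(head_event))  # None
```

Without the replacement step, `json.loads` would reject the first message because `\x` is not a valid JSON escape sequence, which is the same failure the gsub in the configuration guards against.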
Starting the es cluster
su - elasticsearch
/usr/local/elasticsearch/bin/elasticsearch -d
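Visit nginx to generate log entries, then check whether the es cluster has created the corresponding index (index creation takes a little time).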
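When looking for the index, it helps to know which name to expect. The sketch below computes it from a log timestamp, assuming that the date in `%{+YYYY.MM.dd}` is rendered in UTC from the event's @timestamp, and that @timestamp comes from the parsed Nginx line. The function name `index_name` is hypothetical and the second timestamp is an invented example.

```python
from datetime import datetime, timezone


def index_name(timestamp_iso8601, prefix="logstash_nginxlogs-"):
    """Build the expected daily index name for a $time_iso8601 timestamp."""
    ts = datetime.fromisoformat(timestamp_iso8601)
    return prefix + ts.astimezone(timezone.utc).strftime("%Y.%m.%d")


# Timestamp from the access log shown earlier.
print(index_name("2021-08-15T15:10:23+08:00"))  # logstash_nginxlogs-2021.08.15
# Hypothetical early-morning request: in UTC it still belongs to the previous day.
print(index_name("2021-08-15T02:00:00+08:00"))  # logstash_nginxlogs-2021.08.14
```

Under these assumptions, requests made before 08:00 local time (UTC+8) land in the previous day's index, which can be surprising when searching by date.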

Configuring Kibana
Filebeat ships data from nginx to kafka, and logstash then pulls it from kafka. If the data reaches elasticsearch correctly, the index can be configured in Kibana.
[root@es2 ~]# ifconfig | awk 'NR==2 {print $2}'
192.168.126.96
# Start Kibana
[root@es2 ~]# nohup /usr/local/kibana/bin/kibana &
[1] 1495
nohup: ignoring input and appending output to ‘nohup.out’
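Open http://192.168.126.96:5601 in a browser to log in to Kibana. First configure an index pattern: click the Management menu in the left-hand navigation, select Index Patterns on the right, then click Create index pattern in the upper-left corner.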






