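Contents
- Preface
- What is ELK?
- ELK components
- ELK architecture diagram
- Why use ELK?
- Where is ELK used?
- How to set up ELK?
- The hands-on project
- Project analysis
- Building ELK from scratch for the project
- Setting up Elasticsearch
- Setting up Logstash
- Setting up Kibana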
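Preface
Learning never ends, and whatever form it takes, it only becomes part of your own body of knowledge once it produces some output. This article is one such output from my study of ELK. I have organized it with the 3W1H framework (What, Why, Where, How) that I usually apply when learning a new technology.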
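What is ELK?
ELK is a complete solution for collecting, analyzing, and visualizing logs, built from the Elasticsearch open-source ecosystem. The name is an acronym of its three products: Elasticsearch, Logstash, and Kibana. Filebeat is another widely used log collector; it is lighter than Logstash and uses fewer resources. This article nevertheless uses Logstash.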
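ELK components
Elasticsearch is a near-real-time (NRT) distributed search and analytics engine. It can be used for full-text search, structured search, and analytics. It is built on the Apache Lucene full-text search library and written in Java.
Logstash is a data collection, filtering, and parsing engine with near-real-time transport capability. It collects, parses, and filters data and finally sends it to ES.
Kibana is a web-based visualization platform that provides analysis and display for Elasticsearch. It can search and interact with the data in Elasticsearch indices and build tables, charts, and dashboards along many dimensions.
ELK architecture diagram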

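Why use ELK?
As our system architecture keeps evolving from a monolith to distributed systems, microservices, and service meshes, the volume of logs generated by user traffic keeps growing. We urgently need a platform on which logs can be queried and analyzed quickly and accurately.
A complete log analysis platform needs the following main capabilities:
- Collection: gather log data from multiple sources (system error logs plus business access logs)
- Transport: deliver log data to the log platform reliably
- Storage: store the log data
- Analysis: support analysis through a UI
- Alerting: provide error reporting and a monitoring mechanism
ELK provides exactly such a complete solution. All of its components are open source and designed to work together, so it covers many use cases efficiently and has become one of today's mainstream logging systems. Traditionally, ELK has also been regarded as the open-source alternative to Splunk, the leader in the log analysis field.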
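Where is ELK used?
The core use case for ELK is, of course, collecting, analyzing, and visualizing logs from large software and hardware systems. As the number of internet users has grown rapidly in recent years, further use cases have emerged. These have also been boom years for big data, with all kinds of big-data products in use, and we have found that Elasticsearch can handle very large data volumes as well: tens to hundreds of TB is not unusual, and for this kind of search and analysis work we find it more convenient and faster than Hadoop. ELK is therefore used in other areas too, such as SIEM, where many companies use it for security analytics, including intrusion detection, anomalous traffic analysis, and user behavior analysis.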
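How to set up ELK?
We build everything from scratch around a hands-on project.
The hands-on project
Search, analyze, and visualize the business system's logs (system logs plus user access logs) in real time.
Project analysis
- The business system's logs are currently stored in a log table in an Oracle database.
- Logstash first has to collect the data from the log table in Oracle.
- Logstash sends the collected data to Elasticsearch.
- Kibana is used to query, analyze, and visualize the data in ES.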
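Building ELK from scratch for the project
Setting up Elasticsearch
Download ES from the official website. This article uses elasticsearch-6.4.3.tar.gz.
- Extract the archive
tar -zxvf elasticsearch-6.4.3.tar.gz -C /usr/local/
- Edit the main ES configuration file
cd /usr/local/elasticsearch-6.4.3/config
vim elasticsearch.yml
Configure it as follows: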
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: zkc-elasticsearch
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-0
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /usr/local/elasticsearch-6.4.3/data
#
# Path to log files:
#
path.logs: /usr/local/elasticsearch-6.4.3/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
#http.port: 9200
http.cors.enabled : true
http.cors.allow-origin : "*"
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
#cluster.initial_master_nodes: ["node-0"]
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
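The template's comment on discovery.zen.minimum_master_nodes gives the quorum as (total number of master-eligible nodes / 2 + 1). As a small worked illustration, the sketch below evaluates that formula with integer division. The function name is made up for this example, and the single-node setup in this article leaves the setting commented out.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum from the template comment: master-eligible nodes / 2 + 1 (integer division)."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Print the quorum for a few typical cluster sizes.
for nodes in (1, 3, 5):
    print(nodes, "master-eligible nodes -> quorum", minimum_master_nodes(nodes))
```

With an even number of master-eligible nodes the quorum is still a strict majority, which is why odd cluster sizes are the usual choice.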
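- Install the IK Chinese analyzer. The default ES analyzer does not segment Chinese text into words, so download the IK analyzer from GitHub. This article uses elasticsearch-analysis-ik-6.4.3.zip.
- Extract IK
unzip elasticsearch-analysis-ik-6.4.3.zip -d /usr/local/elasticsearch-6.4.3/plugins/ik/
- ES cannot be started as root, so create a regular user and give it ownership of the installation directory
useradd esuser
chown -R esuser /usr/local/elasticsearch-6.4.3/
- I am testing on a virtual machine, so I reduce the ES JVM heap size; skip this step if you have enough memory
vim /usr/local/elasticsearch-6.4.3/config/jvm.options
Configure it as follows: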
-Xms128M
-Xmx128M
- Configure the other system settings ES needs at startup
vim /etc/security/limits.conf
Configure it as follows:
# /etc/security/limits.conf
#
#This file sets the resource limits for the users logged in via PAM.
#It does not affect resource limits of the system services.
#
#Also note that configuration files in /etc/security/limits.d directory,
#which are read in alphabetical order, override the settings in this
#file in case the domain is the same or more specific.
#That means for example that setting a limit for wildcard domain here
#can be overriden with a wildcard setting in a config file in the
#subdirectory, but a user specific setting here can be overriden only
#with a user specific setting in the subdirectory.
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
# - a user name
# - a group name, with @group syntax
# - the wildcard *, for default entry
# - the wildcard %, can be also used with %group syntax,
# for maxlogin limit
#
#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
#
#<domain> <type> <item> <value>
#
#* soft core 0
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#@student - maxlogins 4
* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 4096
# End of file
vim /etc/sysctl.conf
Configure it as follows:
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
#
vm.max_map_count=262145
Apply the settings:
sysctl -p
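Before starting ES, it can help to confirm that these host settings actually took effect. The following is a minimal sketch, not part of the original setup: the helper names are hypothetical, the 65536 threshold is simply the soft nofile value from limits.conf above, and 262144 is the vm.max_map_count minimum that, as far as I know, the Elasticsearch documentation asks for (the 262145 used above satisfies it).

```python
import resource


def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines from sysctl.conf-style text, skipping comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def meets_limit(current: int, required: int) -> bool:
    """True if an rlimit value is unlimited or at least the required value."""
    return current == resource.RLIM_INFINITY or current >= required


def check_host(sysctl_text: str, nofile_soft: int,
               min_map_count: int = 262144, min_nofile: int = 65536) -> list:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    map_count = int(parse_sysctl(sysctl_text).get("vm.max_map_count", "0"))
    if map_count < min_map_count:
        problems.append(f"vm.max_map_count={map_count} is below {min_map_count}")
    if not meets_limit(nofile_soft, min_nofile):
        problems.append(f"nofile soft limit {nofile_soft} is below {min_nofile}")
    return problems


def check_this_host(sysctl_path: str = "/etc/sysctl.conf") -> list:
    """Check the machine this script runs on (run it as esuser in a fresh login)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    with open(sysctl_path) as handle:
        return check_host(handle.read(), soft)
```

Calling check_this_host() as esuser returns an empty list when both settings are in place. The limits.conf values only apply to new login sessions, so log in again before checking.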
- Switch to the new user and start ES
su esuser
cd /usr/local/elasticsearch-6.4.3/bin/
./elasticsearch
- After startup, check the console output and open ES at http://192.168.184.145:9200
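Besides opening the URL in a browser, the running node and the IK plugin can be checked from a script. The sketch below is only an illustration under some assumptions: it uses the ES `_analyze` API with the analyzer name `ik_max_word`, which is one of the analyzers the IK plugin registers as far as I know (`ik_smart` being the other), and the helper names are invented for this example.

```python
import json
import urllib.request

ES_URL = "http://192.168.184.145:9200"  # ES address used in this article


def build_analyze_request(text: str, analyzer: str = "ik_max_word",
                          base_url: str = ES_URL) -> urllib.request.Request:
    """Build a POST /_analyze request that tokenizes `text` with the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to ES and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def tokens(analyze_response: dict) -> list:
    """Extract the token strings from an _analyze response."""
    return [item["token"] for item in analyze_response.get("tokens", [])]
```

Against a running node, `send(urllib.request.Request(ES_URL))` returns the node information shown at the root URL, and `tokens(send(build_analyze_request("日志分析平台")))` lists the words IK produced. If the plugin was not loaded, ES rejects the request because the analyzer is unknown.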

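Setting up Logstash
Download the archive from the official website. This article uses logstash-6.4.3.tar.gz.
- Extract the archive
tar -zxvf logstash-6.4.3.tar.gz
mv logstash-6.4.3 /usr/local/
- Create a sync directory, which will hold the JDBC driver jar and the configuration files used for synchronization
cd /usr/local/logstash-6.4.3
mkdir sync
- Create and edit the sync configuration file
cd sync
vim logstash-db-sync.conf
Configure it as follows: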
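input {
  jdbc {
    # JDBC URL of the Oracle database (host, port, and SID)
    jdbc_connection_string => "jdbc:oracle:thin:@172.16.4.29:1521:urpdb"
    # Database user name and password
    jdbc_user => "USR_JWJC_DEV"
    jdbc_password => "JWJCDEV1234"
    # Location of the JDBC driver jar; an absolute or relative path
    jdbc_driver_library => "/usr/local/logstash-6.4.3/sync/ojdbc8-12.2.0.1.jar"
    # Driver class name
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    # Enable paging
    jdbc_paging_enabled => "true"
    # Rows per page
    jdbc_page_size => "1000"
    # Path of the SQL file to execute
    statement_filepath => "/usr/local/logstash-6.4.3/sync/jwf_log.sql"
    # Cron-style schedule. Fields: minute, hour, day of month, month, day of week.
    # All * means the job runs once every minute.
    schedule => "* * * * *"
    # Value of the type field added to each event (intended here as the index type)
    type => "_doc"
    # Use the value of the tracking column, rather than the time of the last run, as :sql_last_value
    use_column_value => true
    # File in which the last tracked value is saved between runs
    last_run_metadata_path => "/usr/local/logstash-6.4.3/sync/track_time"
    # Name of the tracking column
    tracking_column => "ID"
    # Type of the tracking column
    tracking_column_type => "numeric"
    # Whether to discard the saved tracking value and start over
    clean_run => false
    # Whether to convert column names to lowercase; disabled so that the ID column keeps its upper-case name
    lowercase_column_names => false
  }
}
output {
  # ES output
  elasticsearch {
    # ES address
    hosts => ["192.168.184.145:9200"]
    # Index name
    index => "jwf-logs"
    # Use the ID column as the document ID
    document_id => "%{ID}"
  }
  # Also print each event to the console
  stdout {
    codec => json_lines
  }
}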
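The schedule option takes a cron-style expression whose five fields are minute, hour, day of month, month, and day of week. The sketch below only names the fields of a plain five-field expression so the order is easy to remember. It is an illustration, not Logstash's own parser, which accepts further extensions that are ignored here.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(expression: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(describe_schedule("* * * * *"))    # every minute, as in the config above
print(describe_schedule("*/5 * * * *"))  # every five minutes
print(describe_schedule("0 2 * * *"))    # once a day at 02:00
```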
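- Copy the database driver jar referenced in the configuration into the sync directory; which driver you need depends on your database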

- Write the SQL used for synchronization
vim jwf_log.sql
SELECT * from T_SYSTEM_REQUEST_LOG WHERE ID > :sql_last_value
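To make the role of :sql_last_value concrete, here is a small in-memory simulation of what the settings are meant to achieve together: each scheduled run selects only rows whose ID is greater than the saved value, indexes them under their ID, and saves the highest ID seen. This is a sketch of the idea rather than Logstash's implementation. The MSG column and the function name are hypothetical, and it assumes rows are processed in ascending ID order and that the tracked value starts at 0.

```python
def run_sync(table, index, sql_last_value, page_size=1000):
    """Simulate one scheduled run and return the new sql_last_value.

    table: list of row dicts with an "ID" key (stand-in for T_SYSTEM_REQUEST_LOG)
    index: dict acting as the ES index, keyed by document ID
    """
    # Equivalent of: SELECT * from T_SYSTEM_REQUEST_LOG WHERE ID > :sql_last_value
    # Assumption: rows are handled in ascending ID order.
    new_rows = sorted((row for row in table if row["ID"] > sql_last_value),
                      key=lambda row: row["ID"])
    for start in range(0, len(new_rows), page_size):    # jdbc_page_size
        for row in new_rows[start:start + page_size]:
            index[str(row["ID"])] = row                  # document_id => "%{ID}"
            sql_last_value = row["ID"]                   # tracking_column => "ID"
    return sql_last_value


table = [{"ID": 1, "MSG": "login"}, {"ID": 2, "MSG": "query"}]
index = {}
last = run_sync(table, index, sql_last_value=0)      # first run picks up IDs 1 and 2
table.append({"ID": 3, "MSG": "logout"})
last = run_sync(table, index, sql_last_value=last)   # second run picks up only ID 3
print(last, sorted(index))
```

Because the document ID is taken from the row ID, re-running a sync over rows that were already indexed overwrites the same documents instead of creating duplicates.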
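- Start Logstash with the sync configuration and check that the index and data in ES are correct
cd /usr/local/logstash-6.4.3/bin/
./logstash -f /usr/local/logstash-6.4.3/sync/logstash-db-sync.conf
- Use es-head, or query the ES REST API directly, to check that the index jwf-logs exists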
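As one way to do the REST check, the following sketch asks ES whether the index exists and how many documents it holds. It assumes the host and index name from the Logstash output section and uses two standard ES endpoints, `HEAD /<index>` and `GET /<index>/_count`; the helper names are made up for this example.

```python
import json
import urllib.error
import urllib.request

ES_URL = "http://192.168.184.145:9200"  # ES address from the Logstash output section
INDEX = "jwf-logs"                      # index name from the Logstash output section


def index_url(base_url: str = ES_URL, index: str = INDEX) -> str:
    """URL of the index itself."""
    return f"{base_url}/{index}"


def count_url(base_url: str = ES_URL, index: str = INDEX) -> str:
    """URL of the index's _count endpoint."""
    return f"{base_url}/{index}/_count"


def index_exists(base_url: str = ES_URL, index: str = INDEX) -> bool:
    """HEAD /<index> answers 200 if the index exists and 404 if it does not."""
    request = urllib.request.Request(index_url(base_url, index), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10):
            return True
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise


def document_count(base_url: str = ES_URL, index: str = INDEX) -> int:
    """GET /<index>/_count returns a JSON body with a "count" field."""
    with urllib.request.urlopen(count_url(base_url, index), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["count"]
```

Once Logstash has completed a run, `index_exists()` should return True and `document_count()` should grow as new rows are added to the log table.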


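Setting up Kibana
Download the archive from the official website. This example uses kibana-6.4.3-linux-x86_64.tar.gz.
- Extract the archive
tar -zxvf kibana-6.4.3-linux-x86_64.tar.gz -C /usr/local/
- Edit the Kibana configuration file
cd /usr/local/kibana-6.4.3-linux-x86_64/config/
vim kibana.yml
Configure it as follows. By default Kibana only connects to the ES instance on the local machine, so server.host and elasticsearch.url are set explicitly: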
# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601
# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "192.168.184.145"
# Enables you to specify a path to mount Kibana at if you are running behind a proxy.
# Use the `server.rewriteBasePath` setting to tell Kibana if it should remove the basePath
# from requests it receives, and to prevent a deprecation warning at startup.
# This setting cannot end in a slash.
#server.basePath: ""
# Specifies whether Kibana should rewrite requests that are prefixed with
# `server.basePath` or require that they are rewritten by your reverse proxy.
# This setting was effectively always `false` before Kibana 6.3 and will
# default to `true` starting in Kibana 7.0.
#server.rewriteBasePath: false
# The maximum payload size in bytes for incoming server requests.
#server.maxPayloadBytes: 1048576
# The Kibana server's name. This is used for display purposes.
#server.name: "your-hostname"
# The URL of the Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://192.168.184.145:9200"
# When this setting's value is true Kibana uses the hostname specified in the server.host
# setting. When the value of this setting is false, Kibana uses the hostname of the host
# that connects to this Kibana instance.
#elasticsearch.preserveHost: true
# Kibana uses an index in Elasticsearch to store saved searches, visualizations and
# dashboards. Kibana creates a new index if the index doesn't already exist.
#kibana.index: ".kibana"
# The default application to load.
#kibana.defaultAppId: "home"
# If your Elasticsearch is protected with basic authentication, these settings provide
# the username and password that the Kibana server uses to perform maintenance on the Kibana
# index at startup. Your Kibana users still need to authenticate with Elasticsearch, which
# is proxied through the Kibana server.
#elasticsearch.username: "user"
#elasticsearch.password: "pass"
# Enables SSL and paths to the PEM-format SSL certificate and SSL key files, respectively.
# These settings enable SSL for outgoing requests from the Kibana server to the browser.
#server.ssl.enabled: false
#server.ssl.certificate: /path/to/your/server.crt
#server.ssl.key: /path/to/your/server.key
# Optional settings that provide the paths to the PEM-format SSL certificate and key files.
# These files validate that your Elasticsearch backend uses the same key files.
#elasticsearch.ssl.certificate: /path/to/your/client.crt
#elasticsearch.ssl.key: /path/to/your/client.key
# Optional setting that enables you to specify a path to the PEM file for the certificate
# authority for your Elasticsearch instance.
#elasticsearch.ssl.certificateAuthorities: [ "/path/to/your/CA.pem" ]
# To disregard the validity of SSL certificates, change this setting's value to 'none'.
#elasticsearch.ssl.verificationMode: full
# Time in milliseconds to wait for Elasticsearch to respond to pings. Defaults to the value of
# the elasticsearch.requestTimeout setting.
#elasticsearch.pingTimeout: 1500
# Time in milliseconds to wait for responses from the back end or Elasticsearch. This value
# must be a positive integer.
#elasticsearch.requestTimeout: 30000
# List of Kibana client-side headers to send to Elasticsearch. To send *no* client-side
# headers, set this value to [] (an empty list).
#elasticsearch.requestHeadersWhitelist: [ authorization ]
# Header names and values that are sent to Elasticsearch. Any custom headers cannot be overwritten
# by client-side headers, regardless of the elasticsearch.requestHeadersWhitelist configuration.
#elasticsearch.customHeaders: {}
# Time in milliseconds for Elasticsearch to wait for responses from shards. Set to 0 to disable.
#elasticsearch.shardTimeout: 30000
# Time in milliseconds to wait for Elasticsearch at Kibana startup before retrying.
#elasticsearch.startupTimeout: 5000
# Logs queries sent to Elasticsearch. Requires logging.verbose set to true.
#elasticsearch.logQueries: false
# Specifies the path where Kibana creates the process ID file.
#pid.file: /var/run/kibana.pid
# Enables you specify a file where Kibana stores log output.
#logging.dest: stdout
# Set the value of this setting to true to suppress all logging output.
#logging.silent: false
# Set the value of this setting to true to suppress all logging output other than error messages.
#logging.quiet: false
# Set the value of this setting to true to log all events, including system usage information
# and all requests.
#logging.verbose: false
# Set the interval in milliseconds to sample system and process performance
# metrics. Minimum is 100ms. Defaults to 5000.
#ops.interval: 5000
# The default locale. This locale can be used in certain circumstances to substitute any missing
# translations.
#i18n.defaultLocale: "en"
- Start Kibana
cd /usr/local/kibana-6.4.3-linux-x86_64/bin/
./kibana
- Open the Kibana home page and configure the index pattern to query



- Once the index pattern has been created, open Discover to search the documents in the matching indices using Lucene query syntax


- Custom monitoring charts and dashboards are also supported
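The Lucene query syntax used in Discover can also be sent straight to ES, which is handy for checking a query outside Kibana. The sketch below wraps a Lucene expression in a `query_string` query for the `_search` endpoint. The host and index come from this article, the helper names are hypothetical, and the only field used in the example query is the ID column from the log table.

```python
import json
import urllib.request

ES_URL = "http://192.168.184.145:9200"  # ES address used in this article
INDEX = "jwf-logs"                      # index written by Logstash


def build_search_request(lucene_query: str, size: int = 10,
                         base_url: str = ES_URL,
                         index: str = INDEX) -> urllib.request.Request:
    """Build a POST /<index>/_search request with a query_string query."""
    body = {"size": size, "query": {"query_string": {"query": lucene_query}}}
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def hit_sources(search_response: dict) -> list:
    """Return the original documents from a _search response."""
    return [hit["_source"] for hit in search_response["hits"]["hits"]]


# Example: documents whose ID lies between 1 and 100 (Lucene range syntax).
request = build_search_request("ID:[1 TO 100]", size=5)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` against a running node and feeding the decoded JSON to `hit_sources` yields the same documents Discover would show for that query.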

Note:
Keep the versions of all ELK components the same; otherwise compatibility errors may occur.
Please credit the source when reposting. Link to this article: https://www.uj5u.com/qita/278046.html
