一、簡介
什么是ELK?ELK是Elasticsearch、Logstash、Kibana這三個軟體的首字母縮寫;其中elasticsearch是用來做資料的存盤和搜索的搜索引擎;logstash是資料收集處理平臺,它能夠對特定的資料做分析、切詞、收集、過濾等等處理,通常用于對日志的處理;kibana是用于把處理后的資料做可視化展示,提供一個web界面,方便我們去elasticsearch中檢索想要的資料;elasticsearch是一個高度可擴展的開源全文搜索和分析引擎,它可實作資料的實時全文搜索,支持分布式實作高可用,提供RUSTfull風格的API介面,可以處理大規模日志資料;
elasticsearch是基于java語言在lucene的框架上進行開發實作;lucene是java中的一個成熟免費的開源搜索類別庫,本質上lucene只是提供編程API介面,要想使用lucene框架做搜索引擎,需要用戶自行開發lucene的外殼,實作呼叫lucene的API介面實作全文檢索和搜尋;elasticsearch就是以lucene為資訊檢索庫的搜索引擎;
elasticsearch的基本組件
索引(index):檔案容器,具有類似屬性的檔案的集合,類似關系型資料庫中的表的概念;在elasticsearch中索引名稱必須使用小寫字母;
型別(type):型別是索引內部的邏輯磁區,其意義完全取決于用戶需求,一個索引內部可定義一個或多個型別,一搬來說,型別就是擁有相同的域的檔案的預定義;
檔案(document):檔案是lucene索引和搜索的原子單位,它包含了一個或多個域,是域的容器,基于JSON格式表示,一個域由一個名字,一個或多個值組成;擁有多個值得域,通常我們稱為多值域;
映射(mapping):原始內容存盤為檔案之前需要事先進行分析,例如切詞、過濾掉某些詞等;映射用于定義此分析機制該如何實作;除此之外,ES(elasticsearch)還為映射提供了諸如將域中的內容排序等功能,
elasticsearch集群組件
cluster:ES的集群標識為集群名稱;默認為"elasticsearch",節點就是靠此名字來決定加入到哪個集群中,一個節點只能屬于于一個集群,
Node:運行了單個ES實體的主機即為節點,用于存盤資料、參與集群索引及搜索操作,節點的標識靠節點名,
Shard:將索引切割成為的物理存盤組件;但每一個shard都是一個獨立且完整的索引;創建索引時,ES默認將其分割為5個shard,用戶也可以按需自定義,創建完成之后不可修改,shard有兩種型別primary shard和replica,Replica用于資料冗余及查詢時的負載均衡,每個主shard的副本數量可自定義,且可動態修改,
ES Cluster作業程序
啟動時,通過多播(默認)或單播方式在9300/tcp查找同一集群中的其它節點,并與之建立通信,集群中的所有節點會選舉出一個主節點負責管理整個集群狀態,以及在集群范圍內決定各shards的分布方式,站在用戶角度而言,每個node均可接收并回應用戶的各類請求,
集群有狀態:green, red, yellow;green表示集群狀態健康,各節點上的shard和我們定義的一樣;yellow表示集群狀態亞健康,可能存在shard和我們定義的不一致,比如某個節點宕機了,它上面的shard也隨著消失,此時集群的狀態就是亞健康狀態;一般yellow狀態是很容易轉變為green狀態的;red表示集群狀態不健康,比如3個節點有2個節點都宕機了,那么也就意味著這兩個節點上的shard丟失,當然shard丟失,對應的資料也會隨之丟失;所以red狀態表示集群有丟失資料的風險;
二、elasticsearch集群部署
環境說明
某個服務如果以分布式或集群的模式作業,首先我們要把各節點的時間進行同步,這是集群的基本原則;其次,一個集群的名稱決議不能也不應該依賴外部的dns服務來決議,因為一旦dns服務掛掉,它會影響整個集群的通信,所以如果需要用到名稱決議,我們應該首先考慮hosts檔案來決議各節點名稱;如果集群各節點間需要互相拷貝資料,我們應該還要做ssh 互信;以上三個條件是大多數集群的最基本條件;
| 名稱 | ip地址 | 埠 |
| es1 | 192.168.0.41 | 9200/9300 |
| es2 | 192.168.0.42 | 9200/9300 |
各節點安裝jdk
yum install -y java-1.8.0-openjdk-devel
提示:不同的es版本對jdk的版本要求也不一樣,這個可以去官方檔案中看,對應es版本需要用到的jdk版本;
匯出JAVA_HOME

驗證java版本和JAVA_HOME環境變數

下載elasticsearch rpm包
[root@node01 ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm --2020-10-01 20:44:29-- https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.12.rpm Resolving artifacts.elastic.co (artifacts.elastic.co)... 151.101.110.222, 2a04:4e42:36::734 Connecting to artifacts.elastic.co (artifacts.elastic.co)|151.101.110.222|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 148681336 (142M) [application/octet-stream] Saving to: ‘elasticsearch-6.8.12.rpm’ 100%[==========================================================================>] 148,681,336 133MB/s in 1.1s 2020-10-01 20:45:07 (133 MB/s) - ‘elasticsearch-6.8.12.rpm’ saved [148681336/148681336]
安裝elasticsearch rpm包
[root@node01 ~]# ll total 145200 -rw-r--r-- 1 root root 148681336 Aug 18 19:38 elasticsearch-6.8.12.rpm [root@node01 ~]# yum install ./elasticsearch-6.8.12.rpm Loaded plugins: fastestmirror Examining ./elasticsearch-6.8.12.rpm: elasticsearch-6.8.12-1.noarch Marking ./elasticsearch-6.8.12.rpm to be installed Resolving Dependencies --> Running transaction check ---> Package elasticsearch.noarch 0:6.8.12-1 will be installed --> Finished Dependency Resolution Dependencies Resolved =================================================================================================================================== Package Arch Version Repository Size =================================================================================================================================== Installing: elasticsearch noarch 6.8.12-1 /elasticsearch-6.8.12 229 M Transaction Summary =================================================================================================================================== Install 1 Package Total size: 229 M Installed size: 229 M Is this ok [y/d/N]: y Downloading packages: Running transaction check Running transaction test Transaction test succeeded Running transaction Creating elasticsearch group... OK Creating elasticsearch user... OK Installing : elasticsearch-6.8.12-1.noarch 1/1 ### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd sudo systemctl daemon-reload sudo systemctl enable elasticsearch.service ### You can start elasticsearch service by executing sudo systemctl start elasticsearch.service Created elasticsearch keystore in /etc/elasticsearch Verifying : elasticsearch-6.8.12-1.noarch 1/1 Installed: elasticsearch.noarch 0:6.8.12-1 Complete! [root@node01 ~]#
編輯組態檔

提示:es的主組態檔是/etc/elasticsearch/elasticsearch.yml;其中我們需要配置cluster.name,node.name,path.data,path.log,這四項是非常重要的,cluster.name是配置的集群名稱,同一集群各主機就是依賴這個配置判斷是否是同一集群,所以在同一集群的其他節點的配置,這個名稱必須一致;node.name是用于標識節點名稱,這個名稱在集群中是唯一的,也就說這個名稱在同一集群的其他節點必須唯一,不能重復;path.data用于指定es存放資料的目錄,建議各節點都配置同一個目錄方便管理;其次這個目錄還建議掛載一個存盤;path.logs用于指定es的日志存放目錄;

提示:bootstrap.memory_lock: true這項配置表示啟動es時,立即分配jvm.options這個檔案中定義的記憶體大小;默認沒有啟用,如果要啟用,我們需要主機節點記憶體是否夠用,以及elasticsearch用戶是否能夠申請對應大小的記憶體;network.host用于指定es監聽的ip地址,0.0.0.0表示監聽本機所有可用地址;http.port用于指定對用戶提供服務的埠地址;discovery.zen.ping.unicast.hosts指定對那些主機做單播通信來發現節點;discovery.zen.minimum_master_nodes指定master節點的的最小數量;不指定默認就是1;
完整的配置
[root@node01 ~]# cat /etc/elasticsearch/elasticsearch.yml # ======================== Elasticsearch Configuration ========================= # # NOTE: Elasticsearch comes with reasonable defaults for most settings. # Before you set out to tweak and tune the configuration, make sure you # understand what are you trying to accomplish and the consequences. # # The primary way of configuring a node is via this file. This template lists # the most important settings you may want to configure for a production cluster. # # Please consult the documentation for further information on configuration options: # https://www.elastic.co/guide/en/elasticsearch/reference/index.html # # ---------------------------------- Cluster ----------------------------------- # # Use a descriptive name for your cluster: # cluster.name: test-els-cluster # # ------------------------------------ Node ------------------------------------ # # Use a descriptive name for the node: # node.name: node01 # # Add custom attributes to the node: # #node.attr.rack: r1 # # ----------------------------------- Paths ------------------------------------ # # Path to directory where to store the data (separate multiple locations by comma): # path.data: /els/data # # Path to log files: # path.logs: /els/logs # # ----------------------------------- Memory ----------------------------------- # # Lock the memory on startup: # #bootstrap.memory_lock: true # # Make sure that the heap size is set to about half the memory available # on the system and that the owner of the process is allowed to use this # limit. # # Elasticsearch performs poorly when the system is swapping the memory. # # ---------------------------------- Network ----------------------------------- # # Set the bind address to a specific IP (IPv4 or IPv6): # network.host: 0.0.0.0 # # Set a custom port for HTTP: # http.port: 9200 # # For more information, consult the network module documentation. # # --------------------------------- Discovery ---------------------------------- # # Pass an initial list of hosts to perform discovery when new node is started: # The default list of hosts is ["127.0.0.1", "[::1]"] # discovery.zen.ping.unicast.hosts: ["node01", "node02"] # # Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1): # discovery.zen.minimum_master_nodes: 1 # # For more information, consult the zen discovery module documentation. # # ---------------------------------- Gateway ----------------------------------- # # Block initial recovery after a full cluster restart until N nodes are started: # #gateway.recover_after_nodes: 3 # # For more information, consult the gateway module documentation. # # ---------------------------------- Various ----------------------------------- # # Require explicit names when deleting indices: # #action.destructive_requires_name: true [root@node01 ~]#View Code
創建資料目錄和日志目錄,并把對應目錄修改成elasticsearch屬主和屬組

復制組態檔到其他節點對應位置,并修改node.name為對應節點名稱,并在對應節點上創建資料目錄和日志目錄并把其屬主和屬組修改成elasticsearch

提示:對于node02上的es配置和node01上的配置,唯一不同的就是節點名稱,其余都是一樣的;
啟動node01、node02上的es,并把es設定為開機啟動

提示:可以看到node01和node02上的9200和9300都處于監聽狀態了;9200是用戶對外提供服務的埠,9300是用于集群各節點通信埠;到此2節點的es集群就搭建好了;
驗證:訪問node01和node02的9200埠,看看回應內容cluster_name和cluster_uuid是否是一樣?

提示:可以看到訪問node01和node02的9200埠,回應內容都回應了相同cluster_name和cluster_uuid;說明node01和node02屬于同一個集群;
查看es介面提供的cat介面
[root@node01 ~]# curl http://node02:9200/_cat
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
[root@node01 ~]#
查看集群node資訊
[root@node01 ~]# curl http://node02:9200/_cat/nodes 192.168.0.42 19 96 1 0.00 0.05 0.05 mdi - node02 192.168.0.41 15 96 1 0.03 0.04 0.05 mdi * node01 [root@node01 ~]#
提示:后面帶*號的表示master節點;
查看集群健康狀態
[root@node01 ~]# curl http://node02:9200/_cat/health 1601559464 13:37:44 test-els-cluster green 2 2 0 0 0 0 0 0 - 100.0% [root@node01 ~]#
查看集群索引資訊
[root@node01 ~]# curl http://node02:9200/_cat/indices [root@node01 ~]#
提示:這里顯示空,是因為集群里沒有任何資料;
查看集群分片資訊
[root@node01 ~]# curl http://node02:9200/_cat/shards [root@node01 ~]#
獲取myindex索引下的test型別的1號檔案資訊
[root@node01 ~]# curl http://node02:9200/myindex/test/1
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_expression","resource.id":"myindex","index_uuid":"_na_","index":"myindex"},"status":404}[root@node01 ~]#
[root@node01 ~]# curl http://node02:9200/myindex/test/1?pretty
{
"error" : {
"root_cause" : [
{
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "myindex",
"index_uuid" : "_na_",
"index" : "myindex"
}
],
"type" : "index_not_found_exception",
"reason" : "no such index",
"resource.type" : "index_expression",
"resource.id" : "myindex",
"index_uuid" : "_na_",
"index" : "myindex"
},
"status" : 404
}
[root@node01 ~]#
提示:?pretty表示用易讀的JSON格式輸出;從上面的反饋內容,它告訴我們沒有找到指定的索引;
添加一個檔案到es集群的指定索引
[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1 -d '
{"name":"zhangsan","age":18,"gender":"nan"}'
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}[root@node01 ~]#
提示:這里向es寫指定檔案到指定索引下,回傳不支持header頭部;解決辦法,手動指定頭部型別;
[root@node01 ~]# curl -XPUT http://node01:9200/myindex/test/1 -H 'content-Type:application/json' -d '
{"name":"zhangsan","age":18,"gender":"nan"}'
{"_index":"myindex","_type":"test","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}[root@node01 ~]#
驗證:查看myindex索引下的test型別的1號檔案,看看是否能夠查到我們剛才寫的資料?
[root@node01 ~]# curl http://node01:9200/myindex/test/1?pretty
{
"_index" : "myindex",
"_type" : "test",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "zhangsan",
"age" : 18,
"gender" : "nan"
}
}
[root@node01 ~]#
提示:可以看到回傳了我們剛才寫的檔案內容;
現在再次查看集群的索引資訊和分片資訊

提示:可以看到現在es集群中有一個myindex的索引,其狀態為green;分片資訊中也可以看到有5各主分片和5個replica分片;并且每個分片都的master和replica都沒有在同一個節點;
搜索所有的索引和型別

提示:jq是用于以美觀方式顯示json資料,作用同pretty的一樣;以上命令表示從所有型別所用索引中搜索,name欄位為zhangsan的資訊;如果命中了,就會把對應檔案列印出來;未命中就告訴我們未命中;如下
[root@node01 ~]# curl http://node01:9200/_search?q=age:19|jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 135 100 135 0 0 2906 0 --:--:-- --:--:-- --:--:-- 2934
{
"took": 37,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
[root@node01 ~]# curl http://node01:9200/_search?q=age:18|jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 247 100 247 0 0 10795 0 --:--:-- --:--:-- --:--:-- 11227
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "myindex",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"name": "zhangsan",
"age": 18,
"gender": "nan"
}
}
]
}
}
[root@node01 ~]#
提示:如果要在指定索引中搜索在前面的url加上指定的索引名稱即可;

提示:如果有多個索引我們也可以根據多個索引名稱的特點來使用*來匹配;如下
[root@node01 ~]# curl http://node01:9200/*/_search?q=age:18|jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 247 100 247 0 0 8253 0 --:--:-- --:--:-- --:--:-- 8517
{
"took": 20,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "myindex",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"name": "zhangsan",
"age": 18,
"gender": "nan"
}
}
]
}
}
[root@node01 ~]# curl http://node01:9200/my*/_search?q=age:18|jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 247 100 247 0 0 7843 0 --:--:-- --:--:-- --:--:-- 7967
{
"took": 19,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "myindex",
"_type": "test",
"_id": "1",
"_score": 1,
"_source": {
"name": "zhangsan",
"age": 18,
"gender": "nan"
}
}
]
}
}
[root@node01 ~]#
搜索指定的單個索引的指定型別

提示:以上就是在es集群的命令列介面常用操作,通常我們用es集群,不會在命令列中做搜索,我們會利用web界面來做;命令列只是用于測驗;好了到此es集群就搭建好了;后續我們就可以用logstash收集指定地方的資料,傳給es,然后再利用kibana的web界面來展示es中的資料;
轉載請註明出處,本文鏈接:https://www.uj5u.com/caozuo/146619.html
標籤:其他
上一篇:小白安裝centos遇到的難題
下一篇:Apache虛擬主機網頁部署目錄
