1.使用kibana對索引庫操作

①創建索引庫

PUT /lepeng

②查看索引庫

GET /lepeng

③洗掉索引庫

DELETE /lepeng

2.使用kibana對型別及映射操作

有了索引庫，等于有了資料庫中的database，接下來就需要創建資料庫中的表，創建資料庫表需要設定欄位約束，索引庫也一樣，在創建索引庫的型別時，需要知道這個型別下有哪些欄位，每個欄位有哪些約束資訊，這就叫做欄位映射(mapping)

①Elasticsearch支持的資料型別：

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

String型別，又分兩種：
- text：可分詞，不可參與聚合
- keyword：不可分詞，資料會作為完整欄位進行匹配，可以參與聚合
Numerical：數值型別，分兩類
- 基本資料型別：long、interger、short、byte、double、float、half_float
- 浮點數的高精度型別：scaled_float
  - 需要指定一個精度因子，比如10或100，elasticsearch會把真實值乘以這個因子后存盤，取出時再還原，
Date：日期型別

elasticsearch可以對日期格式化為字串存盤，但是建議我們存盤為毫秒值，存盤為long，節省空間，
Array：陣列型別
- 進行匹配時，任意一個元素滿足，都認為滿足
- 排序時，如果升序則用陣列中的最小值來排序，如果降序則用陣列中的最大值來排序
Object：物件

{
    name:"Jack",
    age:21,    
   	girl:{
		name: "Rose",
        age:21
   }
}

如果存盤到索引庫的是物件型別，例如上面的girl，會把girl編程兩個欄位：girl.name和girl.age

②創建欄位映射

index的默認值就是true，也就是說你不進行任何配置，所有欄位都會被索引，
但是有些欄位是我們不希望被索引的，比如商品的圖片資訊，就需要手動設定index為false，

#ik_max_word 將文本做最細粒度的拆分
#ik_smart 會做最粗粒度的拆分
PUT /lepeng/_mapping/
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_max_word"
    },
    "images": {
      "type": "keyword",
      "index": "false"
    },
    "price": {
      "type": "float"
    }
  }
}

③一次創建索引庫和型別

settings 就是索引庫設定，其中可以定義索引庫的各種屬性，可以不設定，都走默認，

put /lepengA
{
    "settings":{
        "索引庫屬性名":"索引庫屬性值"
    },
    "mappings":{
            "properties":{
                "欄位名":{
                    "映射屬性名":"映射屬性值"
                }
            }
    }
}

例如：

PUT /lepeng1
{
  "settings": {}, 
  "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
  }
}

可以查看lepeng1索引庫：
在這里插入圖片描述

3.使用kibana對檔案操作

對比于資料庫，就是添加表中資料

①新增檔案

POST /索引庫名/_doc
{
    "key":"value"
}

POST /lepeng/_doc
{
  "title": "xiaomi",
  "images": "images/1.jpg",
  "price": 265.00
}

新增檔案的時候會對這個資料生成一個隨機的id
在這里插入圖片描述
當然也可以自己指定一個id添加

POST /lepeng/_doc/1
{
  "title": "redmi",
  "images": "images/2.jpg",
  "price": 670.00,
  "stock": 20
}

在這里插入圖片描述

②查看指定檔案

GET /heima/_doc/id值

GET /lepeng/_doc/1

在這里插入圖片描述

③更改檔案

POST /lepeng/_doc/1
{
  "title": "redmi",
  "images": "images/2.jpg",
  "price": 770.00
}

如果指定id不存在，就是添加，如果指定id存在就是更改

id不存在情況：

在這里插入圖片描述

id存在時

在這里插入圖片描述

④洗掉檔案

DELETE /索引庫名/_doc/id值

在這里插入圖片描述

4.智能判斷

①新增檔案添加索引庫未被配置欄位

在這里插入圖片描述
可見創建成功，然后看一下映射欄位

可以發現issealed被智能判斷為Boolean型別，但是仔context是String型別資料，ES無法智能判斷，它就會存入兩種映射型別，例如：

context：text型別
context.keyword：keyword型別

出現這種情況的原因是，智能映射底層是根據一個指定的模板規則映射的，映射規則如下：

JSON 型別	Elasticsearch 型別
`null`	不添加
`true` or `false`	`boolean`
floating point number	`float`
integer	`long`
string	`text` , 附帶一個 `keyword` 子域

這種智能映射，底層原理是動態模板映射，如果我們想修改這種智能映射的規則，其實只要修改動態模板即可！

②修改智能映射模板的語法

"dynamic_templates": [
    {
      "my_template_name": { 
        ...  match conditions ... 
        "mapping": { ... } 
      }
    },
    ...
  ]

說明：

my_template_name：自定義模板名稱
match conditions：匹配條件，凡是符合條件的未定義欄位，都會按照這個規則來映射
mapping：映射規則，匹配成功后的映射規則

示例：

PUT /lepeng2
{
  "mappings": {
      "properties": {
            "title": {
                "type": "text"
            },
            "images": {
                "type": "keyword",
                "index": false
            },
            "price": {
                "type": "float"
            }
      },
    "dynamic_templates": [
      {
        "my_strings": {
          "match_mapping_type": "string", 
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

然后再存入資料：

POST /lepeng2/_doc/1
{
  "title": "redmi",
  "images": "images/2.jpg",
  "price": 670.00,
  "stock": 20,
  "issealed": false,
  "context": "縱向絲滑"
}

在這里插入圖片描述
可以看到context被映射成了keyword，而非之前的text和keyword并存，說明我們的動態模板生效了！

5.基本查詢

準備資料：

# 創建產品索引庫，然后對title進行ik分詞
PUT /product
{
  "mappings": {
      "properties": {
            "title": {
                "type": "text",
                "analyzer": "ik_max_word"
            },
            "images": {
                "type": "keyword",
                "index": "false"
            },
            "price": {
                "type": "float"
            }
      }
  }
}

# 插入資料
POST product/_doc/1
{
  "title": "小米手機",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 2999
}

POST product/_doc/2
{
  "title": "華為手機",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 3999
}


POST product/_doc/3
{
  "title": "蘋果手機",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 4999
}

POST product/_doc/4
{
  "title": "小米筆記本",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 5999
}

POST product/_doc/5
{
  "title": "聯想筆記本",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 9000
}

POST product/_doc/6
{
  "title": "apple",
  "images": "http://image.leyou.com/12479122.jpg",
  "price": 9000
}

①查詢所有

GET /product/_search
{
  "query": {
    "match_all": {
      
    }
  }
}

②匹配查詢

GET product/_search
{
  "query": {
    "match": {
      "title": "小米手機"
    }
  }
}

因為title采用了text型別，查詢時會對搜索關鍵詞進行分詞，分為小米和手機，然后使用兩個詞分別做檢索，最后將結果取并集
在這里插入圖片描述
某些情況下，我們需要取分詞檢索結果的交集，此時使用"operator":"and"選項實作

GET product/_search
{
  "query": {
    "match": {
      "title": {
        "query": "小米手機", "operator": "and"
      }
    }
  }
}

在這里插入圖片描述

③詞條匹配

term 查詢被用于精確值匹配，這些精確值可能是數字、時間、布爾或者那些未分詞的字串

GET product/_search
{
  "query": {
    "term": {
      "price": {
        "value": "2999"
      }
    }
  }
}

④范圍查詢

range 查詢找出那些落在指定區間內的數字或者時間
range查詢允許以下字符：

運算子	說明
gt	大于
gte	大于等于
lt	小于
lte	小于等于

GET product/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 2999,
        "lt": 4999
      }
    }
  }
}

在這里插入圖片描述

⑤模糊查詢

fuzzy 查詢是 term 查詢的模糊等價，它允許用戶搜索詞條與實際詞條的拼寫出現偏差
fuzziness表示偏差距離，如果為0，就成了詞條匹配

GET product/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "華為手打",
        "fuzziness": 2
      }
    }
  }
}

在這里插入圖片描述

⑥布爾組合(bool)

ool把各種其它查詢通過must（與）、must_not（非）、should`（或）的方式進行組合

GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "小米"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "title": "筆記本"
          }
        }
      ]
    }
  }
}

在這里插入圖片描述

⑦結果過濾

直接指定回傳欄位

指定要回傳的欄位，過濾掉非指定欄位

GET product/_search
{
  "_source": {
    "includes": ["title","price"]
  },
  "query": {
    "match_all": {}
  }
}

指定包含和排除

通過includes來指定想要顯示的欄位，通過excludes來指定不想要顯示的欄位，二者可選一個使用

GET product/_search
{
  "_source": {
    "excludes": "images"
  },
  "query": {
    "match_all": {}
  }
}

⑧排序

單欄位排序

sort 可以讓我們按照不同的欄位進行排序，并且通過order指定排序的方式

GET product/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

在這里插入圖片描述

多欄位排序

假定我們想要結合使用 price和 _score（得分）進行查詢，并且匹配的結果首先按照相關性得分排序，然后按照價格排序

GET product/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

⑨分頁

分頁

elasticsearch的分頁與mysql資料庫非常相似，都是指定兩個值

from：開始位置
size：每頁大小

GET product/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 2
}

在這里插入圖片描述

⑩高亮

在使用match查詢的同時，加上一個highlight屬性：

pre_tags：前置標簽
post_tags：后置標簽
fields：需要高亮的欄位
title：這里宣告title欄位需要高亮，后面可以為這個欄位設定特有配置，也可以空

GET product/_search
{
  "query": {
    "match": {
      "title": "小米手機"
    }
  },
  "highlight": {
    "pre_tags": "<font color='red'>",
    "post_tags": "</font>",
    "fields": {
      "title": {}
    }
  }
}

在這里插入圖片描述

6.分組聚合查詢

基本概念

在我們的mysql有這么兩類函式：

分組函式: group by
聚合函式: sum、avg、max、min

使用它們可以輕松實作對資料的統計分析，其實在ES中也存在類似的用法，只不過名字略有差異，稱為桶和度量

桶（bucket）

桶的作用，是按照某種方式對資料進行分組，每一組資料在ES中稱為一個桶，ES中提供的劃分桶的方式有很多：

Date Histogram Aggregation：根據日期階梯分組，例如給定階梯為周，會自動每周分為一組
Histogram Aggregation：根據數值階梯分組，與日期類似，需要知道分組的間隔（interval）
Terms Aggregation：根據詞條內容分組，詞條內容完全匹配的為一組
Range Aggregation：數值和日期的范圍分組，指定開始和結束，然后按段分組
……

度量（metrics）

分組完成以后，我們一般會對組中的資料進行聚合運算，例如求平均值、最大、最小、求和等，這些在ES中稱為度量

比較常用的一些度量聚合方式：

Avg Aggregation：求平均值
Max Aggregation：求最大值
Min Aggregation：求最小值
Percentiles Aggregation：求百分比
Stats Aggregation：同時回傳avg、max、min、sum、count等
Sum Aggregation：求和
Top hits Aggregation：求前幾
Value Count Aggregation：求總數
……
測驗資料

# 在ES中，需要進行聚合、排序、過濾的欄位其處理方式比較特殊，因此不能被分詞，必須使用keyword或數值型別，
PUT /car
{
  "mappings": {
      "properties": {
        "color": {
          "type": "keyword"
        },
        "make": {
          "type": "keyword"
        }
      }
    }
}
# 匯入資料
POST /car/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "紅", "make" : "本田", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "紅", "make" : "本田", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "綠", "make" : "福特", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "藍", "make" : "豐田", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "綠", "make" : "豐田", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "紅", "make" : "本田", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "紅", "make" : "寶馬", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "藍", "make" : "福特", "sold" : "2014-02-12" }

聚合為桶

汽車的顏色color來劃分桶，按照顏色分桶，最好是使用TermAggregation型別，按照顏色的名稱來分桶

GET car/_search
{
  "size": 0, 
  "aggs": {
    "aggs_color": {
      "terms": {
        "field": "color"
      }
    }
  }
}

aggs：宣告這是一個聚合查詢，是aggregations的縮寫
aggs_color：給這次聚合起一個名字，可任意指定，
terms：聚合的型別，這里選擇terms，是根據詞條內容（這里是顏色）劃分
field：劃分桶時依賴的欄位

桶內度量

每種顏色汽車的平均價格是多少？
我們需要告訴ES使用哪個欄位，使用何種度量方式進行運算，這些資訊要嵌套在桶內，度量的運算會基于桶內的檔案進行

GET car/_search
{
  "size": 0,
  "aggs": {
    "aggs_color": {
      "terms": {
        "field": "color",
        "size": 10
      },
      "aggs": {
        "price_avg": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

在這里插入圖片描述

轉載請註明出處，本文鏈接：https://www.uj5u.com/qita/293897.html

標籤：其他

上一篇：現在公司都不缺人了？軟體測驗作業經歷3年，居然被坑了？防不勝防！

下一篇：Flink 內核原理與實作-入門

Elasticsearch基本操作

1.使用kibana對索引庫操作

①創建索引庫

②查看索引庫

③洗掉索引庫

2.使用kibana對型別及映射操作

①Elasticsearch支持的資料型別：

②創建欄位映射

③一次創建索引庫和型別

3.使用kibana對檔案操作

①新增檔案

②查看指定檔案

③更改檔案

id不存在情況：

id存在時

④洗掉檔案

4.智能判斷

①新增檔案 添加索引庫未被配置欄位

②修改智能映射模板的語法

5.基本查詢

①查詢所有

②匹配查詢

③詞條匹配

④范圍查詢

⑤模糊查詢

⑥布爾組合(bool)

⑦結果過濾

直接指定回傳欄位

指定包含和排除

⑧排序

單欄位排序

多欄位排序

⑨分頁

分頁

⑩高亮

6.分組聚合查詢

基本概念

聚合為桶

桶內度量

①新增檔案添加索引庫未被配置欄位