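I need to group by 9 fields and get the count for each group in Elasticsearch. The original code uses a "script" and performs poorly, so I need to optimize it. I managed to create a new field using "copy_to", but when I aggregate on the new field I run into a problem.
I am using the 'srcIp' and 'dstIp' fields as a test; the copy_to field is 'aggCondition'. Here is the mapping: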
PUT /test_index
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "dynamic_templates": [
      {
        "set_copy_to": {
          "match": "^(src|dst).+",
          "match_pattern": "regex",
          "mapping": {
            "copy_to": "aggCondition",
            "fields": {
              "keyword": {
                "ignore_above": 256,
                "type": "keyword"
              }
            },
            "type": "text"
          }
        }
      }
    ]
  }
}
Then I add some data:
{
  "srcIp": "192.0.0.1",
  "dstIp": "192.0.1.1"
}
{
  "srcIp": "192.0.1.1",
  "dstIp": "192.0.2.1"
}
{
  "srcIp": "192.0.2.1",
  "dstIp": "192.0.0.1"
}
Then I look at the mapping in Kibana, and it looks like this:
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "set_copy_to": {
            "match": "^(src|dst).+",
            "match_pattern": "regex",
            "mapping": {
              "copy_to": "aggCondition",
              "fields": {
                "keyword": {
                  "ignore_above": 256,
                  "type": "keyword"
                }
              },
              "type": "text"
            }
          }
        }
      ],
      "properties": {
        "aggCondition": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "dstIp": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "copy_to": [
            "aggCondition"
          ]
        },
        "srcIp": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "copy_to": [
            "aggCondition"
          ]
        }
      }
    }
  }
}
Then I aggregate on the new field "aggCondition":
GET /test_index/_search
{
  "aggs": {
    "Ips": {
      "terms": {
        "field": "aggCondition.keyword"
      }
    }
  }
}
The result is:
"aggregations" : {
  "Ips" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "192.0.0.1",
        "doc_count" : 2
      },
      {
        "key" : "192.0.1.1",
        "doc_count" : 2
      },
      {
        "key" : "192.0.2.1",
        "doc_count" : 2
      }
    ]
  }
}
But what I expect is:
"aggregations" : {
  "Ips" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "[192.0.0.1 192.0.1.1]",
        "doc_count" : 1
      },
      {
        "key" : "[192.0.1.1 192.0.2.1]",
        "doc_count" : 1
      },
      {
        "key" : "[192.0.2.1 192.0.0.1]",
        "doc_count" : 1
      }
    ]
  }
}
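What should I do to get the expected result, or is there another way to aggregate on multiple fields efficiently?
Reply from a uj5u.com user:
dynamic_templates and copy_to are not the right approach for your case. copy_to adds each source value to the target field as a separate value, so a terms aggregation on aggCondition produces one bucket per IP rather than one per src/dst pair. You are better off computing, at index time, a new field that holds the src/dst IP pair. You can do this with an ingest pipeline that uses an append processor and a join processor to create the new field.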
PUT _ingest/pipeline/ip-pipeline
{
  "processors": [
    {
      "append": {
        "field": "srcDst",
        "value": ["{{{srcIp}}}", "{{{dstIp}}}"]
      }
    },
    {
      "join": {
        "field": "srcDst",
        "separator": "-"
      }
    }
  ]
}
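Before indexing anything, the pipeline can be tried out with the ingest simulate endpoint. The request below is only a minimal sketch: it passes one of the sample documents from the question inline through ip-pipeline, so no index is touched.

```
POST _ingest/pipeline/ip-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "srcIp": "192.0.0.1",
        "dstIp": "192.0.1.1"
      }
    }
  ]
}
```

If the two processors behave as intended, the simulated document in the response should carry a srcDst value of 192.0.0.1-192.0.1.1.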
Then, when you index a new document, you can specify this pipeline and the new field will be created:
PUT my-index/_doc/1?pipeline=ip-pipeline
{
  "srcIp": "192.0.0.1",
  "dstIp": "192.0.1.1"
}
Your indexed document will look like this:
{
  "srcIp": "192.0.0.1",
  "dstIp": "192.0.1.1",
  "srcDst": "192.0.0.1-192.0.1.1"
}
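Passing ?pipeline= on every request is easy to forget. As an alternative sketch, assuming an Elasticsearch version that supports the index.default_pipeline setting (6.5 or later, to my knowledge) and reusing the my-index name from the example, the pipeline can be attached to the index itself so that indexing requests without an explicit pipeline parameter also go through it:

```
PUT my-index/_settings
{
  "index.default_pipeline": "ip-pipeline"
}
```

Documents that were indexed before the pipeline was in place are not changed by this setting; they would have to be indexed again to get the srcDst field.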
You can then run an aggregation query on that new srcDst field and get the result you expect.
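For illustration, such a query might look like the sketch below. It assumes that srcDst was mapped as text with a keyword sub-field, which is what default dynamic mapping produces for string values; ipPairs is just a name chosen here for the aggregation, and "size": 0 only suppresses the search hits.

```
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "ipPairs": {
      "terms": {
        "field": "srcDst.keyword"
      }
    }
  }
}
```

Each bucket key is then one joined pair such as 192.0.0.1-192.0.1.1, so the three sample documents from the question would fall into three buckets with a doc_count of 1 each, provided all of them were indexed through the pipeline.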
