ElasticSearch如何聚合“copy_to”欄位-有解無憂

我需要按 9 個欄位分組并獲取 ElasticSearch 中每個組的計數，原始代碼使用“腳本”并且性能很差，因此我需要對其進行優化。我設法創建了一個新欄位并使用了“copy_to”，但是當我與新欄位聚合時，我發現了一些問題。

我使用' srcIp '和' dstIp '欄位作為測驗，copy_to欄位是' aggCondition '。這是映射：

PUT /test_index
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
      "dynamic_templates": [
    {
      "set_copy_to": {
        "match": "^(src|dst). ",
        "match_pattern": "regex",
        "mapping": {
          "copy_to": "aggCondition",
          "fields": {
            "keyword": {
              "ignore_above": 256,
              "type": "keyword"
            }
          },
          "type": "text"
        }
      }
    }
  ]
  }
}

然后我添加一些資料

{
  "srcIp":"192.0.0.1",
  "dstIp":"192.0.1.1"
}
{
  "srcIp":"192.0.1.1",
  "dstIp":"192.0.2.1"
}
{
  "srcIp":"192.0.2.1",
  "dstIp":"192.0.0.1"
}

然后我在 kibana 中看到映射，它看起來像這樣：

{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "set_copy_to": {
            "match": "^(src|dst). ",
            "match_pattern": "regex",
            "mapping": {
              "copy_to": "aggCondition",
              "fields": {
                "keyword": {
                  "ignore_above": 256,
                  "type": "keyword"
                }
              },
              "type": "text"
            }
          }
        }
      ],
      "properties": {
        "aggCondition": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "dstIp": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "copy_to": [
            "aggCondition"
          ]
        },
        "srcIp": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "copy_to": [
            "aggCondition"
          ]
        }
      }
    }
  }
}

然后我聚合使用新欄位“aggCondition”：

GET /test_index/_search
{
  "aggs": {
    "Ips": {
      "terms": {
        "field": "aggCondition.keyword"
      }
    }
  }
}

結果是

  "aggregations" : {
    "Ips" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "192.0.0.1",
          "doc_count" : 2
        },
        {
          "key" : "192.0.1.1",
          "doc_count" : 2
        },
        {
          "key" : "192.0.2.1",
          "doc_count" : 2
        }
      ]
    }
  }

但我期望的是

  "aggregations" : {
    "Ips" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "[192.0.0.1 192.0.1.1]",
          "doc_count" : 1
        },
        {
          "key" : "[192.0.1.1 192.0.2.1]",
          "doc_count" : 1
        },
        {
          "key" : "[192.0.2.1 192.0.0.1]",
          "doc_count" : 1
        }
      ]
    }
  }

我該怎么做才能獲得預期的結果，或者是否有其他方法可以有效地聚合多欄位？

uj5u.com熱心網友回復：

dynamic_templates并且copy_to不是您的情況的方法。您最好動態計算一個索引 src/dst IP 對的新欄位。您可以使用ingest pipelinewith anappend和join處理器來創建新欄位來實作這一點。

PUT _ingest/pipeline/ip-pipeline
{
  "processors": [
    {
      "append": {
        "field": "srcDst",
        "value": ["{{{srcIp}}}", "{{{dstIp}}}"]
      }
    },
    {
      "join": {
        "field": "srcDst",
        "separator": "-"
      }
    }
  ]
}

然后當你索引一個新檔案時，你可以指定這個管道，新欄位將被創建：

PUT my-index/_doc/1?pipeline=ip-pipeline
{
  "srcIp":"192.0.0.1",
  "dstIp":"192.0.1.1"
}

您的索引檔案將如下所示：

{
  "srcIp":"192.0.0.1",
  "dstIp":"192.0.1.1",
  "srcDst": "192.0.0.1-192.0.1.1"
}

然后您可以在該新srcDst欄位上運行聚合查詢并獲得您期望的結果。

轉載請註明出處，本文鏈接：https://www.uj5u.com/qianduan/363505.html

標籤：爪哇弹性搜索

上一篇：為什么過濾器不適用于彈性搜索？

下一篇：如何在apache鉆上查詢elasticsearch