What is the best way to get the average number of documents per bucket in Elasticsearch?

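Suppose we are hat manufacturers and have an Elasticsearch index in which each document corresponds to the sale of one hat. Part of each sale record is the name of the store that sold the hat. I want to find the number of hats sold by each store, and the average number of hats sold across all stores. The best approach I have been able to work out is this search: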

GET hat_sales/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": {
    "stores": {
      "terms": {
        "field": "storename",
        "size": 65536
      },
      "aggs": {
        "sales_count": {
          "cardinality": {
            "field": "_id"
          }
        }
      }
    },
    "average_sales_count": {
      "avg_bucket": {
        "buckets_path": "stores>sales_count"
      }
    }
  }
}

(Aside: I set size to 65536 because that is the default maximum number of buckets.)

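The problem with this query is that the sales_count aggregation performs a redundant computation: each stores bucket already has a doc_count property. But how can I access doc_count in a buckets path?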

Answer:

I think this is what you're after:

PUT hat_sales
{
  "mappings": {
    "properties": {
      "storename": {
        "type": "keyword"
      }
    }
  }
}

POST hat_sales/_bulk?refresh=true
{"index": {}}
{"storename": "foo"}
{"index": {}}
{"storename": "foo"}
{"index": {}}
{"storename": "bar"}
{"index": {}}
{"storename": "baz"}
{"index": {}}
{"storename": "baz"}
{"index": {}}
{"storename": "baz"}



GET hat_sales/_search
{
  "size": 0,
  "query": {"match_all": {}},
  "aggs": {
    "stores": {
      "terms": {
        "field": "storename",
        "size": 65536
      }
    },
    "average_sales_count": {
      "avg_bucket": {
        "buckets_path": "stores>_count"
      }
    }
  }
}

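The path to reach _count is stores>_count. In a buckets path, _count is a special name that refers to each bucket's doc_count, so no extra metric sub-aggregation is needed.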
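To make the path syntax concrete, here is a minimal Python sketch of how a two-part buckets path could be resolved against a parsed response. It is a simplified model for illustration only, not Elasticsearch's own resolver: the bucket_values helper is hypothetical, and the response fragment is an assumed shape of what the question's query would return for the six sample documents.

```python
def bucket_values(aggregations, buckets_path):
    """Return the per-bucket values that a two-part buckets_path points at.

    Simplified model: handles only "<multi_bucket_agg>><metric>" and
    "<multi_bucket_agg>>_count". This is not Elasticsearch's own resolver.
    """
    agg_name, metric = buckets_path.split(">")
    buckets = aggregations[agg_name]["buckets"]
    if metric == "_count":
        # Special path: use each bucket's document count directly.
        return [b["doc_count"] for b in buckets]
    # Otherwise read the named single-value metric sub-aggregation.
    return [b[metric]["value"] for b in buckets]


# Hypothetical response fragment (an assumption, not captured output),
# shaped like the result of the question's query on the sample data.
aggregations = {
    "stores": {
        "buckets": [
            {"key": "baz", "doc_count": 3, "sales_count": {"value": 3}},
            {"key": "foo", "doc_count": 2, "sales_count": {"value": 2}},
            {"key": "bar", "doc_count": 1, "sales_count": {"value": 1}},
        ]
    }
}

via_metric = bucket_values(aggregations, "stores>sales_count")
via_count = bucket_values(aggregations, "stores>_count")
print(via_metric, via_count)
```

Under these assumptions both paths yield the same per-bucket values, which is why the cardinality sub-aggregation in the question is redundant.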

The result is as follows:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "stores" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "baz",
          "doc_count" : 3
        },
        {
          "key" : "foo",
          "doc_count" : 2
        },
        {
          "key" : "bar",
          "doc_count" : 1
        }
      ]
    },
    "average_sales_count" : {
      "value" : 2.0
    }
  }
}
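As a quick cross-check of average_sales_count, the same figure can be recomputed by hand from the doc_count values in the response above. The following Python sketch assumes the buckets have simply been copied out of that response; the variable names are illustrative.

```python
# Buckets copied from the "stores" aggregation in the response above.
buckets = [
    {"key": "baz", "doc_count": 3},
    {"key": "foo", "doc_count": 2},
    {"key": "bar", "doc_count": 1},
]

# avg_bucket over stores>_count is the mean of the buckets' doc_count values.
total_sales = sum(b["doc_count"] for b in buckets)
average_sales_count = total_sales / len(buckets)
print(average_sales_count)
```

Six sales spread over three stores gives the same average that the avg_bucket aggregation reported.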
