Why doesn't PostgreSQL use my index for a GROUP BY aggregate?

I have a table in a PostgreSQL database with the timescaledb extension enabled. It looks like this:

+------------+--------------------------+-------------+
| Column     | Type                     | Modifiers   |
|------------+--------------------------+-------------|
| time       | timestamp with time zone |  not null   |
| value      | double precision         |  not null   |
| being      | metric_being             |  not null   |
| device     | integer                  |  not null   |
+------------+--------------------------+-------------+

The table has this index:

"metrics_device_time_idx" btree (device, "time" DESC)

But when I query the table with a GROUP BY:

explain select max(time), device from metrics group by device;

it does not use the index:

+-------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                        |
|-------------------------------------------------------------------------------------------------------------------|
| Finalize GroupAggregate  (cost=104577.41..104588.61 rows=22 width=12)                                             |
|   Group Key: _hyper_9_95_chunk.device                                                                             |
|   ->  Gather Merge  (cost=104577.41..104587.95 rows=88 width=12)                                                  |
|         Workers Planned: 4                                                                                        |
|         ->  Sort  (cost=103577.35..103577.41 rows=22 width=12)                                                    |
|               Sort Key: _hyper_9_95_chunk.device                                                                  |
|               ->  Partial HashAggregate  (cost=103576.64..103576.86 rows=22 width=12)                             |
|                     Group Key: _hyper_9_95_chunk.device                                                           |
|                     ->  Parallel Append  (cost=0.00..95035.06 rows=1708317 width=12)                              |
|                           ->  Parallel Seq Scan on _hyper_9_95_chunk  (cost=0.00..44602.70 rows=1122370 width=12) |
|                           ->  Parallel Seq Scan on _hyper_9_92_chunk  (cost=0.00..24807.61 rows=756061 width=12)  |
+-------------------------------------------------------------------------------------------------------------------+

This is starting to get rather slow. On the other hand, the following is about 10 times faster:

select max(time), 29 from metrics where device = 29
union
select max(time), 30 from metrics where device = 30
union
...

Why is that? Can I change my index or my query so that the GROUP BY uses the index? And why is the UNION so much faster?

Answer:

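As @Pavel Stehule mentions in his answer, Postgres does not implement index skip scan, which is what is needed to optimize this kind of query. The TimescaleDB developers recognized that such queries are very useful in time-series analysis, so they implemented index skip scan themselves. It has been available in their extension since version 2.2.1; see their blog post for details.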

After upgrading the extension to >= 2.2.1, the query can be rewritten to take advantage of index skip scan:

select distinct on (device) device, time from metrics order by device, time desc

This uses their index skip scan implementation, which sped the query up by roughly 100x in my case.
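To check whether the rewrite actually gets a skip scan, something like the following may help. It is only a sketch: it assumes the newer TimescaleDB binaries are already installed on the server, and TimescaleDB's documentation asks that the update statement be the first command in a fresh session. Table and column names are the ones from the question.

```sql
-- Show the TimescaleDB version currently installed in this database.
SELECT extversion
FROM pg_extension
WHERE extname = 'timescaledb';

-- Upgrade the extension in this database to the version installed on the server.
-- Assumption: run this as the first command of a new session.
ALTER EXTENSION timescaledb UPDATE;

-- Inspect the plan of the rewritten query.
EXPLAIN
SELECT DISTINCT ON (device) device, time
FROM metrics
ORDER BY device, time DESC;
```

If skip scan is chosen, the plan should contain a Custom Scan (SkipScan) node over an index scan on metrics_device_time_idx instead of the parallel sequential scans shown in the question.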

Answer:

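Postgres cannot use the index in this case; the optimizer simply does not support it yet. You can find some information about this: there is a patch called "index skip scan", but that work is not finished. In the meantime you can use a workaround.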
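One commonly used workaround is to emulate a skip scan with a recursive CTE. The sketch below is not taken from the answer itself; it assumes the table and the (device, "time" DESC) index from the question. It walks the distinct device values one index probe at a time, then fetches the latest time for each device with another index probe. This is essentially what the hand-written UNION does: each max(time) ... where device = N branch can be answered by a single index lookup.

```sql
WITH RECURSIVE devices AS (
    -- Anchor: the smallest device id, found via the index.
    (SELECT device FROM metrics ORDER BY device LIMIT 1)
    UNION ALL
    -- Step: the next device id greater than the previous one.
    -- Yields NULL once there are no more devices, which ends the recursion.
    SELECT (SELECT m.device
            FROM metrics m
            WHERE m.device > d.device
            ORDER BY m.device
            LIMIT 1)
    FROM devices d
    WHERE d.device IS NOT NULL
)
SELECT d.device,
       -- Latest timestamp for this device: one probe into (device, "time" DESC).
       (SELECT m.time
        FROM metrics m
        WHERE m.device = d.device
        ORDER BY m.time DESC
        LIMIT 1) AS max_time
FROM devices d
WHERE d.device IS NOT NULL;  -- drop the terminating NULL row
```

With only a couple of dozen devices, as the row estimates in the plan suggest, this performs a few dozen index probes rather than scanning every row. How well it does on a hypertable with many chunks is worth confirming with EXPLAIN ANALYZE.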
