我們在嘗試為 Postgres 托管的資料集市構建查詢時遇到問題。我們的查詢很簡單,并且包含適量的資料。我們已經看到兩次運行之間此查詢的執行時間存在巨大差異——有時需要大約 20 秒,有時只需要 3 秒——但我們似乎無法看到導致這些差異的原因,我們的目標是獲得一致的結果。查詢中只涉及 2 個表,一個表示訂單行(OrderItemTransactionFact 2,937,264 行),另一個記錄每個專案的當前庫存水平(stocklevels 62,353 行)。沒有外鍵,因為這是一個我們運行 ETL 程序的資料集市,因此需要快速加載。
查詢是;
select
oitf."SKUId",
sum(oitf."ConvertedLineTotal") as "totalrevenue",
sum(oitf."Quantity") as "quantitysold",
coalesce (sl."Available",0) as "availablestock"
from "OrderItemTransactionFact" oitf
left join stocklevels sl on sl."SKUId" = oitf."SKUId"
where
oitf."transactionTypeId" = 2
and oitf."hasComposite" = false
and oitf."ReceivedDate" >= extract(epoch from timestamp '2020-07-01 00:00:00')
and oitf."ReceivedDate" <= extract(epoch from timestamp '2021-10-01 00:00:00')
group by
oitf."SKUId", sl."Available"
order by oitf."SKUId";
OrderItemTransactionFact 表有幾個索引;
create index IX_OrderItemTransactionFact_ReceivedDate on public."OrderItemTransactionFact" ("ReceivedDate" DESC);
create index IX_OrderItemTransactionFact_ReceivedDate_transactionTypeId on public."OrderItemTransactionFact" ("ReceivedDate" desc, "transactionTypeId");
26 秒運行的執行計劃輸出是;
GroupAggregate (cost=175096.24..195424.66 rows=813137 width=52) (actual time=24100.268..24874.065 rows=26591 loops=1)
Group Key: oitf."SKUId", sl."Available"
Buffers: shared hit=659 read=43311 written=1042
-> Sort (cost=175096.24..177129.08 rows=813137 width=19) (actual time=24100.249..24275.594 rows=916772 loops=1)
Sort Key: oitf."SKUId", sl."Available"
Sort Method: quicksort Memory: 95471kB
Buffers: shared hit=659 read=43311 written=1042
-> Hash Left Join (cost=20671.85..95274.08 rows=813137 width=19) (actual time=239.392..23127.993 rows=916772 loops=1)
Hash Cond: (oitf."SKUId" = sl."SKUId")
Buffers: shared hit=659 read=43311 written=1042
-> Bitmap Heap Scan on "OrderItemTransactionFact" oitf (cost=18091.90..73485.91 rows=738457 width=15) (actual time=200.178..22413.601 rows=701397 loops=1)
Recheck Cond: (("ReceivedDate" >= '1585699200'::double precision) AND ("ReceivedDate" <= '1625097600'::double precision))
Filter: ((NOT "hasComposite") AND ("transactionTypeId" = 2))
Rows Removed by Filter: 166349
Heap Blocks: exact=40419
Buffers: shared hit=55 read=42738 written=1023
-> Bitmap Index Scan on ix_orderitemtransactionfact_receiveddate (cost=0.00..17907.29 rows=853486 width=0) (actual time=191.274..191.274 rows=867746 loops=1)
Index Cond: (("ReceivedDate" >= '1585699200'::double precision) AND ("ReceivedDate" <= '1625097600'::double precision))
Buffers: shared hit=9 read=2365 written=181
-> Hash (cost=1800.53..1800.53 rows=62353 width=8) (actual time=38.978..38.978 rows=62353 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2948kB
Buffers: shared hit=604 read=573 written=19
-> Seq Scan on stocklevels sl (cost=0.00..1800.53 rows=62353 width=8) (actual time=0.031..24.301 rows=62353 loops=1)
Buffers: shared hit=604 read=573 written=19
Planning Time: 0.543 ms
Execution Time: 24889.522 ms
But then execution plan for the same query when it took just 3 seconds;
GroupAggregate (cost=173586.52..193692.59 rows=804243 width=52) (actual time=2616.588..3220.394 rows=26848 loops=1)
Group Key: oitf."SKUId", sl."Available"
Buffers: shared hit=2 read=43929
-> Sort (cost=173586.52..175597.13 rows=804243 width=19) (actual time=2616.570..2813.571 rows=889937 loops=1)
Sort Key: oitf."SKUId", sl."Available"
Sort Method: quicksort Memory: 93001kB
Buffers: shared hit=2 read=43929
-> Hash Left Join (cost=20472.48..94701.25 rows=804243 width=19) (actual time=185.018..1512.626 rows=889937 loops=1)
Hash Cond: (oitf."SKUId" = sl."SKUId")
Buffers: shared hit=2 read=43929
-> Bitmap Heap Scan on "OrderItemTransactionFact" oitf (cost=17892.54..73123.18 rows=730380 width=15) (actual time=144.000..960.232 rows=689090 loops=1)
Recheck Cond: (("ReceivedDate" >= '1593561600'::double precision) AND ("ReceivedDate" <= '1633046400'::double precision))
Filter: ((NOT "hasComposite") AND ("transactionTypeId" = 2))
Rows Removed by Filter: 159949
Heap Blocks: exact=40431
Buffers: shared read=42754
-> Bitmap Index Scan on ix_orderitemtransactionfact_receiveddate (cost=0.00..17709.94 rows=844151 width=0) (actual time=134.806..134.806 rows=849039 loops=1)
Index Cond: (("ReceivedDate" >= '1593561600'::double precision) AND ("ReceivedDate" <= '1633046400'::double precision))
Buffers: shared read=2323
-> Hash (cost=1800.53..1800.53 rows=62353 width=8) (actual time=40.500..40.500 rows=62353 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2948kB
Buffers: shared hit=2 read=1175
-> Seq Scan on stocklevels sl (cost=0.00..1800.53 rows=62353 width=8) (actual time=0.025..24.620 rows=62353 loops=1)
Buffers: shared hit=2 read=1175
Planning Time: 0.565 ms
Execution Time: 3235.300 ms
The server config is; version: PostgreSQL 12.1, compiled by Visual C build 1914, 64-bit work_mem : 1048576kb shared_buffers : 16384 (x8kb)
Thanks in advance!
uj5u.com熱心網友回復:
它是檔案系統快取。慢的必須從磁盤讀取資料。快的只需要從記憶體中獲取資料,可能是因為慢的已經讀取了它并將其留在那里。您可以通過打開 track_io_timing 將其明確顯示在計劃中。
在 上建立索引應該會有所幫助("transactionTypeId","hasComposite","ReceivedDate"),可能對提高 Effective_io_concurrency 有很大幫助(取決于您的存盤系統)。
但大多數情況下,獲得更快的磁盤。
轉載請註明出處,本文鏈接:https://www.uj5u.com/qukuanlian/341814.html
標籤:sql postgresql performance data-warehouse
上一篇:從.accdb訪問緩慢啟動,但不是從Access應用程式中啟動
下一篇:實物短缺下的現貨白銀操作建議
