在具有相同資料庫的兩個不同環境(本地機器和 heroku 上的生產)之間,我們發現相同的、相當簡單的查詢的執行時間存在很大差異。
查詢是:
SELECT "property_tax_bills".* FROM "property_tax_bills" INNER JOIN "property_tax_bill_parsed_addresses" ON "property_tax_bills"."id" = "property_tax_bill_parsed_addresses"."property_tax_bill_id" WHERE "property_tax_bill_parsed_addresses"."parsed_address_id" = 2 AND "property_tax_bills"."statement_date" = '2021-11-20';
property_tax_bills 是一個大表(4400 萬條記錄),連接表 property_tax_bill_parsed_addresses 也同樣大。
通過 psql 本地運行的 EXPLAIN ANALYZE 命令回傳以下內容:
Gather (cost=1105.30..64845.64 rows=219 width=147) (actual time=15.054..18.941 rows=101 loops=1)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=25539
-> Nested Loop (cost=105.30..63823.74 rows=91 width=147) (actual time=12.147..12.291 rows=34 loops=3)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Inner Unique: true
Buffers: shared hit=25539
Worker 0: actual time=10.752..10.898 rows=36 loops=1
Buffers: shared hit=7647
Worker 1: actual time=11.014..11.162 rows=28 loops=1
Buffers: shared hit=6483
-> Parallel Bitmap Heap Scan on public.property_tax_bill_parsed_addresses (cost=104.73..32879.67 rows=3672 width=8) (actual time=0.634..3.758 rows=1442 loops=3)
Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
Recheck Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
Heap Blocks: exact=1745
Buffers: shared hit=3912
Worker 0: actual time=0.041..2.924 rows=1295 loops=1
Buffers: shared hit=1171
Worker 1: actual time=0.045..3.356 rows=1099 loops=1
Buffers: shared hit=987
-> Bitmap Index Scan on part_i_ptbpa_o_p_a_id_where_not_null (cost=0.00..102.53 rows=8812 width=0) (actual time=1.049..1.049 rows=4325 loops=1)
Index Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
Buffers: shared hit=9
-> Index Scan using property_tax_bills_pkey on public.property_tax_bills (cost=0.56..8.43 rows=1 width=147) (actual time=0.006..0.006 rows=0 loops=4325)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Index Cond: (property_tax_bills.id = property_tax_bill_parsed_addresses.property_tax_bill_id)
Filter: (property_tax_bills.statement_date = '2021-11-20'::date)
Rows Removed by Filter: 1
Buffers: shared hit=21627
Worker 0: actual time=0.006..0.006 rows=0 loops=1295
Buffers: shared hit=6476
Worker 1: actual time=0.007..0.007 rows=0 loops=1099
Buffers: shared hit=5496
Planning:
Buffers: shared hit=498
Planning Time: 1.750 ms
Execution Time: 19.000 ms
在生產環境中,通過 heroku pg:psql 運行的相同命令回傳:
Nested Loop (cost=0.20..7.50 rows=1 width=147) (actual time=15.895..2152.165 rows=101 loops=1)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Inner Unique: true
Buffers: shared hit=5581088
-> Index Scan using index_property_tax_bills_on_statement_date on public.property_tax_bills (cost=0.09..3.38 rows=1 width=147) (actual time=0.034..208.051 rows=1110860 loops=1)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Index Cond: (property_tax_bills.statement_date = '2021-11-20'::date)
Buffers: shared hit=26788
-> Index Scan using i_pr_tax_bill_p_a_o_p_r_b_id on public.property_tax_bill_parsed_addresses (cost=0.11..4.12 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=1110860)
Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
Index Cond: (property_tax_bill_parsed_addresses.property_tax_bill_id = property_tax_bills.id)
Filter: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
Rows Removed by Filter: 1
Buffers: shared hit=5554300
Planning Time: 0.225 ms
Execution Time: 2152.224 ms
如您所見,本地查詢計劃要復雜得多,而且估計時間要長得多。然而,在 prod 上,查詢計劃簡單直觀,估計速度很快,但不知何故實際上執行速度比本地慢 2 個數量級。該應用程式正在實時運行,但流量很少(4-5 個用戶,最多 15 個請求/分鐘)。
關于每臺機器規格的一些附加資訊:
Local 有 64GB 的 RAM 和 12 個在 NVME 上運行的內核,大約能夠達到 5 GB/s 的讀/寫 - Postgres 13.4 版
生產是 Heroku 上的標準 9 764GB RAM postgres 實體。- Postgres 客戶端版本 13.4
uj5u.com熱心網友回復:
這是你的問題:
在 public.property_tax_bill_parsed_addresses 上使用 i_pr_tax_bill_p_a_o_p_r_b_id 進行索引掃描(成本=0.11..4.12 行=1 寬度=8)(實際時間=0.002..0.002 行=0 回圈=1110860)...索引。 id) 過濾器:(property_tax_bill_parsed_addresses.parsed_address_id = 2)
過濾器移除的行數:1
它正在執行 1110860 次索引掃描,并在成功找到資料后洗掉大部分資料。
將 parsed_address_id 添加到該索引中,以避免之后的過濾。
CREATE INDEX idx_name_of_your_index ON property_tax_bill_parsed_addresses (property_tax_bill_id,parsed_address_id);
有了這個索引后,查詢計劃會改變嗎?
uj5u.com熱心網友回復:
感謝Frank Heikens,我們發現沒有使用更有效的索引。我們在 和 上都有唯一索引property_tax_bill_id和唯一索引(property_tax_bill_id, parsed_address_id)。
通過洗掉冗余唯一索引property_tax_bill_id然后重新添加它,它促使查詢計劃器使用更快的計劃 - 這是新的結果計劃:
Nested Loop (cost=0.20..652.83 rows=3 width=147) (actual time=26.039..26.505 rows=101 loops=1)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Inner Unique: true
Buffers: shared hit=25524
-> Index Scan using part_i_ptbpa_o_p_a_id_where_not_null on public.property_tax_bill_parsed_addresses (cost=0.09..208.09 rows=108 width=8) (actual time=0.026..4.929 rows=4325 loops=1)
Output: property_tax_bill_parsed_addresses.id, property_tax_bill_parsed_addresses.property_tax_bill_id, property_tax_bill_parsed_addresses.parsed_address_id, property_tax_bill_parsed_addresses.raw_address, property_tax_bill_parsed_addresses.processed_address, property_tax_bill_parsed_addresses.created_at, property_tax_bill_parsed_addresses.updated_at, property_tax_bill_parsed_addresses.is_verified, property_tax_bill_parsed_addresses.error_message
Index Cond: (property_tax_bill_parsed_addresses.parsed_address_id = 2)
Buffers: shared hit=3899
-> Index Scan using property_tax_bills_pkey on public.property_tax_bills (cost=0.11..4.12 rows=1 width=147) (actual time=0.005..0.005 rows=0 loops=4325)
Output: property_tax_bills.id, property_tax_bills.owner_name, property_tax_bills.property_address, property_tax_bills.mailing_address, property_tax_bills.statement_date, property_tax_bills.pluto_record_id, property_tax_bills.created_at, property_tax_bills.updated_at, property_tax_bills.bbl
Index Cond: (property_tax_bills.id = property_tax_bill_parsed_addresses.property_tax_bill_id)
Filter: (property_tax_bills.statement_date = '2021-11-20'::date)
Rows Removed by Filter: 1
Buffers: shared hit=21625
Planning:
Buffers: shared hit=20
Planning Time: 0.593 ms
Execution Time: 26.554 ms
它現在正在使用其他似乎可以加快速度的部分索引。雖然仍然沒有使用最快的索引(跨 parsed_address_id 和 property_tax_bill_id 的多列),但性能提升足以滿足我們的需求。
這仍然有點神秘,因為我不確定洗掉和重新添加是如何徹底改變查詢計劃的,但我不想在嘴里看禮物。
轉載請註明出處,本文鏈接:https://www.uj5u.com/houduan/404424.html
標籤:
