我們有一個從 Oracle 遷移的 Postgres 資料庫。有一個查詢在 Posgres 中很慢(8-11 秒),但在 Oracle 中卻很快(1-2 秒)。它涉及一個SELECT DISTINCT,其他幾個執行緒在 Postgres 中已將其標記為問題:1、2。我們可以做一些簡單的事情嗎,比如添加索引或其他東西,而無需進行重大重寫?
該查詢將表NED_PERSON_T與自身連接起來,以獲取UNIQUEIDENTIFIER其父值在子表中指定的所有行NIHSERVAO。如下SELECT DISTINCT所示。
select
distinct nedperson0_.ID as ID1_21_,
nedperson0_.BUILDINGNAME as BUILDING2_21_,
nedperson0_.CREATED_DATE as CREATED_4_21_,
nedperson0_.CURRENT_FLAG as CURRENT_5_21_,
/* ... 30 other columns ... */
from
NED_PERSON_T nedperson0_
inner join
NED_PERSON_T nedperson1_
on (
nedperson1_.NIHSERVAO=nedperson0_.UNIQUEIDENTIFIER
and nedperson0_.CURRENT_FLAG='Y'
and nedperson1_.CURRENT_FLAG='Y'
)
order by
nedperson0_.NIHMIXCASESN,
nedperson0_.MIXCASE_GIVENNAME
EXPLAIN ANALYZE給出了這個報告:
Unique (cost=362155.58..390755.66 rows=119167 width=1197) (actual time=8722.383..11202.495 rows=838 loops=1)
Sort Key: nedperson0_.nihmixcasesn, nedperson0_.mixcase_givenname, nedperson0_.id, nedperson0_.buildingname, nedperson0_.created_date, nedperson0_.description, ... ... )
Sort Method: external merge Disk: 78016kB
-> Gather (cost=103675.72..227466.24 rows=119167 width=1197) (actual time=2485.391..3100.424 rows=97678 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Join (cost=102675.72..214549.54 rows=49653 width=1197) (actual time=2477.318..2880.555 rows=32559 loops=3)
Hash Cond: (nedperson1_.nihservao = nedperson0_.uniqueidentifier)
-> Parallel Index Scan using ned_person_t_current_flag_idx on ned_person_t nedperson1_ (cost=0.43..93430.46 rows=58867 width=11) (actual time=7.536..815.779 rows=46800 loops=3)
Index Cond: (current_flag = 'Y'::bpchar)
-> Parallel Hash (cost=93430.46..93430.46 rows=58867 width=1154) (actual time=1620.284..1620.285 rows=46800 loops=3)
Buckets: 4096 Batches: 64 Memory Usage: 1792kB
-> Parallel Index Scan using ned_person_t_current_flag_idx on ned_person_t nedperson0_ (cost=0.43..93430.46 rows=58867 width=1154) (actual time=4.602..1226.402 rows=46800 loops=3)
Index Cond: (current_flag = 'Y'::bpchar)
Planning Time: 3.986 ms
Execution Time: 11218.403 ms
索引:
CREATE INDEX ned_person_t_ao_test1 ON ned.ned_person_t USING btree (nihmixcasesn, mixcase_givenname, uniqueidentifier)
CREATE INDEX ned_person_t_current_flag_idx ON ned.ned_person_t USING btree (current_flag)
CREATE INDEX ned_person_t_id_idx ON ned.ned_person_t USING btree (id)
CREATE INDEX ned_person_t_mdslink_idx ON ned.ned_person_t USING btree (nihmdslinktoadnih)
CREATE INDEX ned_person_t_nedid_idx ON ned.ned_person_t USING btree (uniqueidentifier)
CREATE INDEX ned_person_t_orgacronym_idx ON ned.ned_person_t USING btree (nihorgacronym)
CREATE INDEX ned_person_t_orgstat_idx ON ned.ned_person_t USING btree (organizationalstat)
CREATE UNIQUE INDEX ned_person_t_pk ON ned.ned_person_t USING btree (id)
CREATE INDEX ned_person_t_sac_idx ON ned.ned_person_t USING btree (nihsac)
CREATE INDEX ned_person_t_temp_idx ON ned.ned_person_t USING btree (nihservao, organizationalstat, nihsac)
CREATE INDEX ned_person_t_temp_idx2 ON ned.ned_person_t USING btree (lower(nihmdslinktoadnih), current_flag, nihservao, organizationalstat, nihsac)
NED_PERSON_T 上的統計資料:
select count(*) from ned_person_t
總計數 = 1243733
select count(*) from ned_person_t where current_flag = 'Y';
CURRENT_FLAG = Y:142540
select count(*) from ned_person_t where nihservao is not null;
with non-NULL NIHSERVAO: 644852
select count(distinct nihservao) from ned_person_t;
Distinct NIHSERVAO: 928
uj5u.com熱心網友回復:
使 EXISTS 查詢快速運行所需的是一個索引(current_flag, nihservao)(或者可能是在 current_flag 上過濾的 nihservao 上的部分索引),以便它可以使用僅索引掃描來獲取 nihservao 值,而無需跳轉到表的隨機部分對于每一行。這需要保持桌子充分吸塵才能有效。
如果您的許多其他查詢也只關注當前資料,那么按 current_flag 對資料進行磁區可能是有意義的,以使相關資料更密集地打包。不過,這比僅僅創建一個索引更加激烈。
uj5u.com熱心網友回復:
您的EXISTS查詢看起來不錯。
1243733 行中只有 142540 行具有CURRENT_FLAG='Y'.
并且只有 644852 有nihservao IS NOT NULL. (與第一行相結合很重要。)
沒有uniqueidentifier IS NULL.
如果您定期查詢該條件,則部分索引 應該非常有效。我建議這兩個:
CREATE INDEX ON ned.ned_person_t (nihservao)
WHERE current_flag = 'Y' AND nihservao IS NOT NULL;
由于您只需要回傳uniqueidentifier并且fullname(就像您稍后評論的那樣) - 如果fullname不是太大 -覆寫索引可以提供更多幫助。(需要對桌子進行足夠的吸塵。)
CREATE INDEX ON ned.ned_person_t (uniqueidentifier) INCLUDE (fullname)
WHERE current_flag = 'Y';
例子:
- 具有主鍵和外鍵的查詢是否比僅具有主鍵的查詢運行得更快?
同時檢查您的所有索引是否實際上(仍然)被使用。這個是完全多余的,洗掉它:
CREATE INDEX ned_person_t_id_idx ON ned.ned_person_t USING btree (id)
id已經被唯一索引覆寫ned_person_t_pk(它實際上可能是一個PRIMARY KEY而不是唯一索引)。
最后的排序操作不太重要,因為只剩844下行。
uj5u.com熱心網友回復:
作為更多資訊,根據上述建議,我 (1) 使用 EXISTS 重寫了查詢,(2) 將 MEM_SIZE 增加到 100 MB。但這仍然需要 7 秒。下面是解釋。如果我之后按順序重新運行此查詢,由于快取需要更少的時間,但第一次運行大約需要 7 秒,如下所示。
更新: Exists 重寫確實將時間縮短了一半,大約 4 秒。這次運行并沒有真正的代表性,我們嘗試了更多(斷開/重新連接后)并且 EXISTS 平均好一半左右。
詢問:
select
nedperson0_.ID as ID1_21_,
nedperson0_.BUILDINGNAME as BUILDING2_21_,
nedperson0_.CREATED_DATE as CREATED_4_21_,
nedperson0_.CURRENT_FLAG as CURRENT_5_21_,
/* ... 30 other columns ... */
from
NED_PERSON_T nedperson0_
where
exists (select uniqueidentifier from ned_person_t nedperson1_
where nihservao = nedperson0_.uniqueidentifier and nedperson1_.current_flag = 'Y'
)
and
nedperson0_.current_flag = 'Y'
order by
nedperson0_.NIHMIXCASESN,
nedperson0_.MIXCASE_GIVENNAME
解釋分析:
Sort (cost=139290.39..139293.75 rows=1344 width=1195) (actual time=7627.351..7627.441 rows=844 loops=1)
Sort Method: quicksort Memory: 882kB
-> Nested Loop (cost=97523.64..139220.55 rows=1344 width=1195) (actual time=5112.442..7621.895 rows=844 loops=1)
-> HashAggregate (cost=97523.21..97531.04 rows=783 width=11) (actual time=5108.419..5109.041 rows=847 loops=1)
Group Key: nedperson1_.nihservao
-> Index Scan using ned_person_t_current_flag_idx on ned_person_t nedperson1_ (cost=0.43..97172.64 rows=140228 width=11) (actual time=2.376..4999.905 rows=142540 loops=1)
Index Cond: (current_flag = 'Y'::bpchar)
-> Index Scan using ned_person_t_nedid_idx on ned_person_t nedperson0_ (cost=0.43..53.21 rows=2 width=1152) (actual time=2.964..2.964 rows=1 loops=847)
Index Cond: (uniqueidentifier = nedperson1_.nihservao)
Filter: (current_flag = 'Y'::bpchar)
Rows Removed by Filter: 23
Planning Time: 10.259 ms
Execution Time: 7627.670 ms
轉載請註明出處,本文鏈接:https://www.uj5u.com/qiye/422649.html
標籤:
下一篇:具有最大表單大小滯后的全屏表單
