我有 2 個存盤網路資訊的 SQL Server 表,EF 架構是:
public partial class edge
{
public long edge_id { get; set; }
public string source { get; set; }
public string target { get; set; }
public Nullable<System.DateTime> edgedate { get; set; }
}
public partial class node
{
public string node_id { get; set; }
public string name { get; set; }
public string address { get; set; }
}
我正在從 UI 傳遞特定于邊緣和節點的過濾器以構建到 SQL 查詢中,如下所示:
select *
from [dbo].[Nodes]
where name = 'John Doe'
or address = '123 Fake Street'
select *
from [dbo].[Edges]
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59
然而,這些查詢必須考慮整個網路,即必須將節點過濾器應用于邊緣,反之亦然 -
-- nodes example with edge filters applied
select *
from [dbo].[Nodes]
where name = 'John Doe'
or address = '123 Fake Street'
and node_id in (select source
from EDGESTEMP
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59'
union
select target
from EDGESTEMP
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59')
這在小規模網路上運行良好,但是如果我處理的網路有 100 萬條邊和 50 萬個節點,則運行這些查詢的性能會受到影響,因為檢查每個實體中的另一個表時的in陳述句。
我已經在查詢的所有附屬列上添加了索引,但是需要知道是否有更有效的方法來做到這一點?
附加資訊
查詢計劃 -這里
Clustered indexes are set on each primary key i.e. node_id and edge_id and non clustered set on the rest e.g. -
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20211017-194859] ON [dbo].[NODES]
(
[name] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
uj5u.com熱心網友回復:
您可以使用exists來進行更有效的查詢,而不是使用union。
此外,您不應該使用不明確的日期文字。我不確定這些日期是 1 月 12 日還是 12 月 1 日。另外,對于日期時間范圍查詢,您不應該使用 >= 和 <=,而是使用 >= 和 <。您可以在代碼中看到這些調整:
select *
from [dbo].[Nodes] n
where (name = 'John Doe'
or address = '123 Fake Street')
and exists (select *
from EDGESTEMP e
where (n.node_id = e.source or n.node_id = e.target)
and e.edgedate >= '20200112'
and e.edgedate < '20210113');
順便說一句,我假設您已經在源、目標和邊緣日期上建立了索引。如果沒有創建它們。
uj5u.com熱心網友回復:
首先,您的查詢似乎有一個邏輯錯誤:or.
其次,UNION ALL通常比 好UNION,盡管在半連接中通常無關緊要,例如IN或EXISTS
select n.*
from [dbo].[Nodes] n
where (n.name = 'John Doe'
or n.address = '123 Fake Street')
and node_id in (
select source
from EDGES
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59'
union all
select target
from EDGES
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59'
);
最后,對于這個查詢,你可能應該有以下索引
NODES (name) INCLUDE (node_id)
NODES (address) INCLUDE (node_id)
EDGES (edgedate) INCLUDE (source, target)
這種or情況可能仍會導致問題,因為您可能仍會在 上進行索引掃描NODES。如果是這樣,您可能需要重寫查詢以強制使用索引聯合。
select n.*
from (
select *
from [dbo].[Nodes] n
where n.name = 'John Doe'
union
select *
from [dbo].[Nodes] n
where n.address = '123 Fake Street'
) n
where node_id in (
select source
from EDGES
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59'
union all
select target
from EDGES
where edgedate >= '01/12/2020 00:00:00'
and edgedate <= '01/12/2021 23:59:59'
);
轉載請註明出處,本文鏈接:https://www.uj5u.com/qiye/328794.html
標籤:sql-server nodes edges
